McGraw-Hill 

Computer  Science  Series 


About  the  Book 

"We  have  felt  strongly  that  this  book  should 
present computers in their entirety because
by  taking  features  out  of  context  or  making 
composite  machines  we  remove  all  the  real 
constraints  of  the  design.  Although  there 
are  a  few  exceptions,  the  book  is  about 
real  machines  which  have  been  built." 

From  the  Authors'  Preface 

Computer Structures utilizes a case study
approach to cover 40 distinct computer
types, and to provide a taxonomical framework
for analyzing the 1000 different computers
now extant. Of the 40 computers
described, 10 are based on unpublished
sources. Approximately one-third of the
book is devoted to original machine descriptions
and new "framework" material.

By introducing two sets of notations and
positing a conceptual framework for each
computer example, the authors offer the
reader the opportunity of comparing different
computers as to their capabilities and
limitations. Throughout the book, the computer
is viewed as a complex structure composed
of hierarchies of simpler structures.
Within this context, particular computer
examples are examined by decomposing
them into their constituent elements. In
addition, the authors discuss the two
computer levels that have just recently
emerged: (1) the level of processors, memories,
and switches (PMS), and (2) the instruction
set processor level (ISP). These
two levels have not been generally discussed
in any other logical design text.
This comprehensive view of computer
hardware structures is further supplemented
by circuit and production details.

Assuming a basic knowledge of programming
and logical design, Computer Structures
can be used either as a text in
computer organization courses, or as a
reference for computer designers, software
system designers, and programmers.



Digitized  by  the  Internet  Archive 
in  2014 


https://archive.org/details/computerstructuresOObell 


Computer Structures: Readings and Examples


McGraw-Hill  computer  science  series 


RICHARD W. HAMMING
Bell  Telephone  Laboratories 

EDWARD  A.  FEIGENBAUM 
Stanford  University 

Bell  and  Newell    Computer  Structures:  Readings  and  Examples 

Cole    Introduction  to  Computing 

Gear    Computer  Organization  and  Programming 

Givone    Introduction  to  Switching  Circuit  Theory 

Hellerman    Digital  Computer  System  Principles

Kohavi    Switching  and  Finite  Automata  Theory

Liu    Introduction  to  Combinatorial  Mathematics

Ralston    Introduction  to  Computer  Science 

Rosen    Programming  Systems  and  Languages 

Salton    Automatic  Information  Organization  and  Retrieval

Watson    Timesharing  System  Design  Concepts 

Wegner    Programming  Languages,  Information  Structures,  and  Machine 
Organization 


Computer  Structures:  Readings  and  Examples 

C.  Gordon  Bell 

Professor  of  Computer  Science  and  Electrical  Engineering 
Carnegie-Mellon  University 
Allen  Newell 
University  Professor 
Carnegie-Mellon  University 


McGraw-Hill  Book  Company 
New  York    St.  Louis    San  Francisco    Düsseldorf
London    Mexico    Panama    Rio  de  Janeiro 
Singapore    Sydney    Toronto


To  Brigham,  Laura, 
Paul 


Computer  Structures:  Readings  and  Examples 

Copyright © 1971 by McGraw-Hill, Inc. All rights reserved. Printed in the
United  States  of  America.  No  part  of  this  publication  may  be  reproduced, 
stored  in  a  retrieval  system,  or  transmitted,  in  any  form  or  by  any  means, 
electronic,  mechanical,  photocopying,  recording,  or  otherwise,  without 
the  prior  written  permission  of  the  publisher. 

Library  of  Congress  Catalog  Card  Number  75-109245 

07-004357-4

1234567890 HDBP 79876543210

This  book  was  set  in  News  Gothic  by  Graphic  Services,  Inc.,  printed  on 
permanent  paper  by  Halliday  Lithograph  Corporation,  and  bound  by 
The  Book  Press,  Inc.  The  designer  was  Elliot  Epstein;  the  drawings  were 
done  by  John  Cordes,  J.  &  R.  Technical  Services,  Inc.  The  editors  were 
Richard  Dojny  and  J.  W.  Maisel.  William  P.  Weiss  supervised  production. 


Preface 


The structures that we call computer systems continue to grow in complexity, in
size, and in diversity. This book is linked firmly to the nature of this growth. The
book  is  about  the  upper  levels  of  computer  structure:  about  instruction  sets,  which 
define  a  computer  system  at  the  programming  level;  and  about  organizations  of 
processors, memories, switches, input-output devices, controllers, and communication
links, which provide the ultimate functioning system. These levels are just
emerging  into  well-defined  systems  levels — with  developed  symbolic  techniques  of 
analysis  and  synthesis  and  accumulated  engineering  know-how,  all  expressed  in  a 
crystallized  representation.  These  aspects  of  computer  systems  have  always  existed, 
of course, but only in rudimentary form. The classical four-box picture of a computer
(arithmetic unit, memory, input-output, and control) is certainly an effective
organization of components to process information. But multiple processors, hierarchies
of memories, and remote communications force the top level of organization
into a distinct level, requiring analysis and rational design. Similarly, the 25 instructions
of the IBM 701 computer (developed around 1953) is certainly an instruction
set — indeed  one  worthy  of  study.  But  processors  with  dozens  of  registers  and 
almost  unlimited  logical  circuitry,  again  force  the  instruction  set  to  become  a  topic 
of  rational  analysis  and  design. 

This book is tied to the emergence of these upper levels of organization: eight
years  ago  (a  computer  engineer's  half  dozen)  would  have  been  too  early  to  write 
this  book;  eight  years  hence  would  be  too  late.  Eight  years  ago  the  diversity  and 
complexity  of  computer  structures  was  not  sufficient  to  justify  the  attention  this 
book  provides.  This  book  would  have  been  too  thin.  Eight  years  hence  textbooks  will 
exist  that  treat  these  levels  systematically.  This  book  will  then  appear  too  descriptive. 

But  right  now,  as  these  aspects  of  computer  structure  are  emerging,  and  with 
systematic  treatment  still  precluded,  there  is  a  need  to  make  available  material  on 
these  levels  for  systematic  reference  and  study.  Our  choice  has  been  to  present  a 
large  set  of  examples,  which  illustrate  the  various  design  options  and  structural 
possibilities,  both  in  instruction  sets  and  in  overall  configurations.  These  examples 
are  descriptions  of  actual  computer  systems,  taken  from  the  technical  literature  or 
from  technical  reports  and  manuals.  Descriptions  of  actual  systems  are  to  be  much 
preferred  over  idealized  abstractions.  The  latter  can  reflect  the  real  issues  only  after 
successful  systematization. 

Not  only  are  the  chapters  about  actual  computers,  they  present  much  detail.  The 
complexity  of  computers  resides  in  part  in  their  size  and  the  multiplicity  of  their 
parts — e.g.,  to  their  having  200  instructions  rather  than  20,  or  having  to  service 
50  Teletypes  rather  than  2.  It  seems  essential  to  describe  computer  systems  in  their 
entirety,  rather  than  via  simplified  vignettes.  Again,  this  view  stems  from  the  existing 
state  of  the  art.  Eight  years  hence,  it  will  not  necessarily  hold. 

We fall from grace on all the above principles, providing occasionally descriptions
of paper machines and partial descriptions of partial systems. But our feeling
that  detail  and  reality  is  important  remains.  This  is  why  this  book  is  so  large;  and  fit 
for  study  rather  than  for  reading. 



The  book  presents  a  large  number  of  examples.  Variation  needs  to  be  presented 
along  all  the  major  dimensions  that  instruction  sets  and  system  configurations 
currently  exhibit.  Thus,  as  a  glance  at  the  table  of  contents  will  show,  the  examples 
in  the  book  are  hardly  picked  at  random.  The  variation  is  empirical.  It  exists  in  the 
population  of  computers  that  have  actually  been  built.  This  characteristic  of  the 
book  stems,  again,  from  our  assessment  that  the  upper  levels  of  computer  structure 
are  still  in  an  essentially  descriptive  and  empirical  state  of  development.  However, 
as  the  book  documents,  ample  variation  occurs  in  existing  computer  systems.  The 
evidence  presented  here  should  finally  lay  to  rest  the  remarks — once  echoed  almost 
universally  and  still  heard  occasionally — that  nothing  has  happened  in  computer 
structure  since  the  von  Neumann  machine. 

Dimensions of variations imply a framework, for dimensions do not by themselves
arise from a population of systems. They require the aid, witting or not, of a
conceptual  framework.  As  the  first  three  chapters  of  the  book  testify,  we  have  most 
wittingly  created  a  framework,  and  have  had  no  hesitation  in  imposing  it  throughout 
the  book.  However,  in  keeping  with  our  view  already  expressed,  this  framework  is 
primarily  descriptive.  It  has  come  inductively  from  the  common  lore,  from  our  own 
experiences  as  designers,  and  from  the  effort  of  putting  this  book  together.  This 
attempt  at  systematization  has  given  rise  to  two  notations:  one  for  instruction  sets 
(ISP)  and  the  other  for  configurations  of  major  components  (PMS).  But,  again,  these 
notations  are  primarily  descriptive. 

So  much  for  what  the  book  actually  tries  to  provide.  What  are  our  goals  for  it? 
The  first  is  educational.  There  are  three  distinct  populations  of  professionals  whose 
education  is  to  be  served  by  this  book:  the  computer  engineer,  who  will  design 
physical  computer  systems;  the  computer  scientist,  who  is  concerned  primarily 
with  the  programming  level  and  with  various  abstract  views  of  information  processing; 
and  the  electrical  engineer,  who  sees  computer  systems  simply  as  one  part  of  a 
larger  technology. 

For all of these, we see no sense in talking of elementary versus advanced treatments
of computer structure. There is surely "less" versus "more," but consistent
with  our  view  of  the  current  art,  no  vertical  stratification  of  education  is  possible 
in  instruction  sets  and  device  configurations.  It  is  sufficient,  in  the  present  day,  for 
these  aspects  of  computer  systems  to  become  accepted  as  worthy  of  study  in  their 
own  right. 

This  book  will  hardly  make  easy  fare  for  undergraduate  students,  who  do  not 
have  an  instructor  somewhat  skilled  in  the  art  that  is  being  taught.  However,  this 
book  is  meant  for  study.  A  good  instructor  can,  we  feel,  develop  an  excellent  course 
(or  part  thereof)  in  computer  structures,  taking  this  book  as  the  basic  material.  In 
addition to the three introductory chapters, Chapter 5 (on the DEC PDP-8), by
providing  a  complete  example  of  a  computer  system  with  descriptions  at  all  systems 
levels,  helps  to  tie  the  aspects  of  computer  structure  discussed  in  this  book  to  the 
view  students  will  pick  up  from  a  traditional  course  in  logical  design. 

It  goes  without  saying  that  for  the  computer  engineer  and  designer,  the  material 
of  this  book  should  be  fully  assimilated.  In  designing  a  new  computer  system,  or 
subsystem  thereof,  he  should  be  familiar  with  all  that  this  book  has  to  offer — the 
design choices, the structural variations possible, the experiments of the past and
the design needs they attempted to satisfy. Given that systematic analysis does not
yet  exist,  there  is  no  substitute  for  extensive,  critical  understanding  of  the  existing 
examples  of  designed  systems.  We  assume  the  student  of  computer  engineering 
comes  to  this  book  with  a  working  knowledge  of  logical  design.  He  should  find  it 
possible  to  realize  many  of  the  systems  described  in  this  book  at  the  next  lower 
levels  of  logic  structure. 

For  the  computer  scientist,  the  levels  of  computer  structure  discussed  in  this  book 
constitute  a  substantial  part  of  what  he  should  know  about  the  physical  devices  that 
underlie  his  science.  As  we  pass  downward  from  these  levels  to  lower  ones — to 
register-transfer  systems,  sequential  logic  circuits,  combinatory  circuits,  continuous 
circuits  and  on  down — the  relevance  of  each  level  gradually  fades.  The  levels  of  this 
book,  along  with  the  register-transfer  level  constitute  the  main  aspects  of  computer 
structure  that  the  computer  scientist  must  understand.  It  does  not  matter  that  they 
are,  as  yet,  basically  empirical  and  descriptive.  The  computer  scientist  undoubtedly 
will  not  be  able  to  carry  through  the  design  of  the  systems  described  in  this  book 
in  terms  of  the  lower  logic  levels,  but  this  is  not  necessary  for  an  appropriate  grasp 
of  these  upper  levels  of  computer  structure.  Indeed,  this  is  what  it  means  for  distinct 
systems  levels  to  exist. 

For  the  electrical  engineer,  this  book  undoubtedly  presents  more  examples  than 
he  cares  to  know  (or  needs  to).  But  an  appropriate  sampling,  plus  the  overview 
presented  in  the  first  three  chapters,  is  appropriate  to  give  him  some  insight  into 
the  elaborate  growth  that  has  occurred  on  top  of  the  basic  digital  technology  created 
within  electrical  engineering. 

The  student  of  systems  engineering  may  also  find  the  material  presented  here 
useful,  as  an  example  of  a  class  of  complex  systems  which  has  evolved  several 
distinct  levels  of  representation.  Again,  the  book  undoubtedly  presents  too  massive 
a  dose  of  detail  for  him,  but  the  overview  in  the  first  chapters,  plus  a  sampling 
throughout  the  space  of  computer  systems,  should  prove  highly  instructive. 

We  have  goals  for  the  book  in  addition  to  the  educational  ones.  We  think  the  book 
can  serve  as  a  useful  reference  for  the  practicing  computer  engineer.  The  time  is 
past  when  every  computer  engineer  knows  about  all  computer  systems  because  he 
has  lived  through  all  of  computer  history.  That  position  is  now  reserved  for  those  of 
us  who  are  past  forty  (and  still  active).  For  the  rest,  a  source  book  that  provides  the 
cumulated  design  experience  of  the  field  is  a  useful  substitute,  especially  so  if  it 
contains enough detail so that a designer can reasonably evaluate the actual computer
systems that embody a particular design alternative.

Behind  the  goal  of  the  book  as  a  guide  for  the  practicing  computer  designer 
lies  the  feeling  that  the  field  of  computer  engineering  needs  to  develop  a  sense  of 
history  and  of  looking  to  the  past  for  guidance.  The  fantastic  advance  in  basic  logic 
technology — in  speed,  cost,  and  reliability —  makes  each  day  seem  an  absolutely 
new  one.  But,  of  course,  it  is  not.  Many  alternative  designs  have  been  tried  out  in 
past  systems,  in  ways  relevant  to  current  design.  Thus,  we  have  the  goal  of  saving 
some  of  the  past  in  a  form  accessible  to  the  future  needs  of  computer  design.  This 
goal  is  mixed  with  a  certain  archival  feeling.  Many  of  the  systems  in  this  book  have 
never  been  documented,  other  than  in  manuals  and  various  elementary  how-to 
programming  books. 



A  final  goal  comes  from  our  feelings  as  computer  scientists  that  the  variety  of 
computer systems is a phenomenon worthy of study in its own right. This book carries,
therefore,  an  invitation  to  taxonomy — to  asking  how  to  classify  the  diversity  of 
forms  of  computer  systems  that  are  coming  into  existence.  Taxonomic  endeavors 
usually  take  place  in  a  field  of  natural  systems,  particularly  biological  systems.  It 
may  seem  strange  that  a  domain  of  artificial  systems  calls  for  taxonomic  activity. 
But  the  demand  for  empirical  classification  exists  whenever  there  is  a  population  of 
significant  size  and  rich  structure.  Rudimentary  classification  efforts  have  occurred 
for  many  populations  of  artifacts — for  ships,  for  aircraft,  for  houses.  This  book 
should  amply  confirm  that  computer  systems  are  complex  and  diverse  enough — 
and undergoing enough continual proliferation and evolution — to command significant
taxonomic endeavor.

Enough  is  said  in  the  first  two  chapters  about  the  new  notations  introduced  in 
the  book,  so  that  nothing  substantive  need  be  added  here.  We  apologize  for  inflicting 
new  notation  on  the  reader.  We  feel  that  good  notations  are  really  quite  important 
for  the  aspects  of  computer  structure  described  in  this  book.  Much  would  be  gained 
by  the  whole  field  of  computers — by  users,  programmers,  engineers,  planners, 
buyers,  sellers,  manufacturers,  students,  and  scientists — if  relatively  uniform 
notations came into common use. Although we have no illusions about the perfection
of the notations we have introduced, we would be most happy if they cause a
rise  in  concern  for  standard  notations  and  nomenclature. 

A  large  number  of  distinct  systems  are  described  in  substantial  detail.  We  have 
redescribed  many  of  the  systems  in  the  common  notation  introduced  in  the  book. 
The  accuracy  of  all  these  descriptions  is  a  major  problem.  Even  where  the  papers 
are  reproduced  from  the  literature,  this  problem  of  accuracy  remains — although 
then it is not ours alone. Even though we have taken pains to obtain accurate information
on the systems and to portray them faithfully in our various descriptions
and figures, there is no way we can be responsible for their ultimate accuracy. The
PMS and ISP figures, in particular, cannot be guaranteed to be accurate representations
of the systems they purport to describe. Ultimately, one would like to have
simulation  languages  for  such  notations  and  to  verify  (up  to  the  usual  criteria  of  a 
debugged  program)  that  a  system  given  by,  say,  an  ISP  description,  simulates  the 
behavior  of  the  target  machine.  But  that  day  is  still  far  off. 

Our  most  fundamental  acknowledgment  is  to  the  contributors  to  this  volume, 
not  only  for  the  articles  they  have  written,  but  for  the  computers  they  have  designed 
and  built,  thereby  creating  a  population  of  fascinating  artifacts  worthy  of  study.  An 
additional  reason  for  reprinting  their  articles  rather  than  simply  describing  their 
computer  systems  is  the  importance  of  having  available  the  views  of  the  designers 
themselves  about  the  nature  of  their  systems. 

The  research  on  the  basic  ideas  underlying  the  notations  was  supported  by 
Advanced  Research  Projects  Agency  of  the  Office  of  the  Secretary  of  Defense 
(F  44620-67-C-0058)  and  is  monitored  by  the  Air  Force  Office  of  Scientific  Research. 

We  would  like  to  extend  an  acknowledgment  to  the  organizations  that  have 
produced  all  of  these  computers,  oftentimes  it  would  seem  in  defiance  of  the  laws 
of  economics.  Perhaps,  as  the  old  saw  has  it,  a  computer  manufacturer  is  simply  a 
computer's way of breeding another computer. This might account for the tenacity
shown by computer manufacturers in spawning the vast numbers of computer
systems  that  provide  our  field  of  study.  Within  this  general  acknowledgment,  we 
would  like  to  extend  a  very  specific  one  to  all  the  people  in  these  organizations  who 
helped  make  information  available  to  us — the  manuals,  photographs,  dates,  etc., 
that  this  book  has  demanded  in  such  great  quantity. 

We  are  indebted  to  the  students  who  have  read  and  criticized  the  various  PMS 
and ISP figures: Richard Dove, Wayne Kohl, Michael Knudsen, Paul Mobus, and
Charles Pfefferkorn. Ken Fitzgerald and Anita Jones of IBM were kind enough to
read  the  introduction  to  the  IBM  System/360. 

Professor  David  L.  Parnas  initially  reviewed  the  text  and  contents,  thus  providing 
many  helpful  suggestions.  Our  other  colleagues,  especially  Professors  Angel  Jordan, 
Alan Perlis, Herbert Simon and Everard M. Williams deserve a special thanks for
their  patience  and  encouragement. 

Finally, we would like to thank those who were a part of the machine that assembled
the book: the editors of McGraw-Hill; Mrs. Mary Ross who assembled the bibliography,
figures, and contributor articles; Mrs. Mildred Sisko who typed the PMS and
ISP  Appendix;  and  especially  Mrs.  Dorothy  Josephson  who  not  only  typed  nearly  all 
drafts  of  the  book,  but  also  the  final  PMS  figures,  and  ISP  Appendices. 


C.  Gordon  Bell 
Allen Newell


Acknowledgments 


R. H. Allmark and J. R. Lucking: Design of an Arithmetic Unit Incorporating
a Nesting Store, Proceedings of the International Federation of Information
Processing Congress 1962, pp. 694-698, North Holland Publishing Co.,
Amsterdam, Holland, by permission from American Federation of Information
Processing Societies (AFIPS), Spartan Books, Washington, D.C.

R. L. Alonso, H. Blair-Smith, and A. L. Hopkins: Some Aspects of the Logical
Design of a Control Computer, A Case Study, Transactions on Electronic
Computers, vol. EC-12, no. 6, pp. 687-697, December, 1963, by permission
of the authors and the Institute of Electrical and Electronics Engineers
(IEEE).

James P. Anderson, Samuel A. Hoffman, Joseph Shifman, and Robert J.
Williams:  D825 — A  Multiple  Computer  System  for  Command  and  Control, 
Proceedings  of  the  AFIPS  Fall  Joint  Computer  Conference,  vol.  22,  pp.  86-96, 
1962,  by  permission  from  AFIPS,  Spartan  Books,  Washington,  D.C.  The 
authors  acknowledge: 

The  authors  wish  to  acknowledge  the  outstanding  efforts  of  their  many 
colleagues  at  Burroughs  Laboratories  who  have  contributed  so  well 
and in so many ways to all stages of D825 design, development, fabrication,
and programming. It would be impossible to cite all of these
efforts.  The  authors  also  wish  to  acknowledge  the  contributions  of 
Mr.  William  B.  Slack  and  Mr.  William  W.  Carver,  also  of  Burroughs 
Laboratories.  Mr.  Slack  has  been  closely  associated  with  the  D825  from 
its  original  conception  to  its  implementation  in  hardware  and  software. 
Mr.  Carver  made  important  contributions  to  the  writing  and  editing 
of  this  paper. 

George  H.  Barnes,  Richard  M.  Brown,  Maso  Kato,  David  J.  Kuck,  Daniel  L. 
Slotnick,  and  Richard  A.  Stokes:  The  ILLIAC  IV  Computer,  Transactions 
on  Computers,  vol.  C-17,  no.  8,  pp.  746-757,  August  1968,  by  permission  of 
the  authors  and  the  IEEE.  The  authors  acknowledge: 

This  work  was  supported  in  part  by  the  Department  of  Computer 
Science, University of Illinois, Urbana, Illinois, and in part by the Advanced
Research Projects Agency as administered by the Rome Air
Development Center, Griffiss Air Force Base, Rome, New York, under
Contract USAF 30(602)4144.

The  authors  are  pleased  to  acknowledge  their  indebtedness  to  the 
group  at  the  Westinghouse  Electric  Corporation  that  initiated  the 
parallel computer effort. The work of W. C. Borck, A. B. Carroll,
J. R. Hudson, W. H. Leonard, R. C. McReynolds, and G. Shapiro formed
the  basis  for  the  subsequent  efforts.  Of  particular  importance  is  the 
work  of  J.  G.  Gregory  in  tuning  the  conceptual  design  to  the  real 
world  of  technology. 

Theodore  R.  Bashkow,  Azra  Sasson,  and  Arnold  Kronfeld:  System  Design 
of  a  FORTRAN  Machine,  Transactions  on  Electronic  Computers,  vol.  EC-16, 
no.  4,  pp.  485-499,  August  1967,  by  permission  of  the  authors  and  the  IEEE. 
The  authors  acknowledge: 

This research is supported by the Air Force Office of Scientific Research
Contract AF19(628)-2798.

G.  A.  Blaauw  and  F.  P.  Brooks,  Jr.:  The  Structure  of  System/360,  Part  I — 
Outline of the Logical Structure, IBM Systems Journal, vol. 3, no. 2, pp. 119-135,
1964, by permission from the IBM Systems Journal.

Erich  Bloch:  The  Engineering  Design  of  the  Stretch  Computer,  Proceedings 
of  the  Eastern  Joint  Computer  Conference,  1959,  pp.  48-58,  by  permission 
of  the  author  and  the  Institute  of  Electrical  and  Electronics  Engineers. 
The  author  acknowledges: 

The  efforts  and  contributions  of  many  people  have  gone  into  the 
engineering  design  of  the  Stretch  computer.  To  mention  all  would  be 
impossible.  However,  the  following  individuals  and  their  groups  were 
responsible for the units indicated: Mr. R. T. Blosk for the Instruction
Unit, Mr. J. F. Dirac for the Look-ahead Units, Messrs. J. A. Hipp
and O. L. MacSorley for the Arithmetic Units, and Mr. L. O. Ulfsparre
for  the  Memory  Bus.  The  Systems  Development  was  under  the  guidance 
of  Messrs.  S.  W.  Dunwell  and  R.  E.  Merwin. 

Arthur W. Burks, Herman H. Goldstine, and John von Neumann: Preliminary
Discussion of the Logical Design of an Electronic Computing
Instrument,  "Collected  Works  of  John  von  Neumann,"  vol.  V,  pp.  34-79, 
General  Editor:  A.  H.  Taub,  Macmillan  Company,  by  permission  from 
Pergamon  Press,  New  York,  1963.  The  authors  acknowledge: 

This report has been prepared in accordance with the terms of Contract
W-36-034-ORD-7481 between the Research and Development
Service, Ordnance Department, U.S. Army and the Institute for Advanced
Study.

The  authors  wish  to  express  their  thanks  to  Dr.  John  Tukey,  of  Princeton 
University,  for  many  valuable  discussions  and  suggestions. 

John W. Carr III: UNIVAC Scientific (1103A) Instruction Logic, pp. 77-83;
IBM 650 Instruction Logic, pp. 93-98; Instruction Logic of the Soviet
Strela (Arrow), pp. 111-115; Instruction Logic of the MIDAC, pp. 115-121,
chap. 2, Programming and Coding, "Handbook of Automation, Computation,
and Control," vol. 2, edited by Eugene M. Grabbe, Simon Ramo, and
Dean Wooldridge, Copyright © 1959 John Wiley & Sons, Inc., New York,
reprinted by permission.

J. Presper Eckert, Jr., James R. Weiner, H. Frazer Welsh, and Herbert F.
Mitchell: The UNIVAC System, American Institute of Electrical Engineers-
Institute of Radio Engineers Conference, pp. 6-16, December, 1951, by
permission of the authors and the IEEE. The authors acknowledge:

The  UNIVAC  System  has  been  an  over-all  company  project  and 
hundreds of people have participated. It is, therefore, difficult to
acknowledge the contributions of individuals. However, special mention
must be made of the contributions of Mr. H. Lukoff, Mr. E. I.
Blumenthal, Mr. L. D. Wilson, and Mr. J. D. Chapline, Jr. To the
Census Bureau a great debt of gratitude is owed for their continuous
support  of  the  project. 

W. S. Elliott, C. E. Owen, C. H. Devonald, and B. C. Maudsley: The Design
Philosophy of Pegasus, A Quantity-production Computer, Proceedings of
the Institution of Electrical Engineers, London, Pt. B, vol. 103, Supplement
2, pp. 188-196, 1956, by permission of the Institution of Electrical
Engineers.  The  authors  acknowledge: 

The  authors  would  like  to  acknowledge  the  contributions  that  Mr. 
C.  Strachey  and  Dr.  D.  B.  Gillies,  of  the  National  Research  Development 
Corporation,  and  Dr.  J.  M.  Bennett  and  Mr.  T.  G.  H.  Braunholtz,  of 
Ferranti,  Ltd.,  made  to  the  logical  design  of  Pegasus:  particular  thanks 
are  due  to  Mr.  C.  Strachey  for  originating  the  order  code. 

They also thank Ferranti, Ltd., and the National Research Development
Corporation for permission to publish the paper.

R. R. Everett: The Whirlwind I Computer, Review of Electronic Digital
Computers, Joint Computers American Institute of Electrical Engineers-
Institute of Radio Engineers Conference, pp. 70-74, February, 1952, by
permission  of  the  author  and  the  IEEE. 

Thomas W. Kampe: The Design of a General-purpose Microprogram-controlled
Computer with Elementary Structure, Institute of Radio
Engineers, Transactions on Electronic Computers, vol. EC-9, no. 2, pp. 208-213,
June, 1960, by permission of the author and the IEEE. The author
acknowledges: 

The  author  wishes  to  thank  his  co-designers,  R.  Compton  and  T.  Hayata, 
for  their  assistance  during  the  design  of  the  SD-2  computer  and  for 
their  suggestions  on  this  paper. 

T. Kilburn, D. B. G. Edwards, M. J. Lanigan, and F. H. Sumner: One-level
Storage System, Institute of Radio Engineers Transactions, vol. EC-11,
no. 2, pp. 223-235, April, 1962, by permission of the authors and the IEEE.
The  authors  acknowledge: 

The  authors  gratefully  acknowledge  the  contributions  made  to  this 
work by all members of the Atlas computer team at both Manchester
University and Ferranti Ltd.

B. W. Lampson, W. W. Lichtenberger, and M. W. Pirtle: A User Machine
in a Time-sharing System, Proceedings of the Institute of Electrical and
Electronics Engineers, vol. 54, no. 12, pp. 1766-1774, December, 1966,
by  permission  of  the  authors  and  the  IEEE.  The  authors  acknowledge: 

The work for this paper was supported in part by the Advanced Research
Projects Agency, Department of Defense, Contract SD-185.

The  software  portion  of  the  system  was  designed  and  written  in  part 
by L. P. Deutsch, who is entitled to equal credit with the authors for
the  ideas  in  this  paper.  L.  Barnes  also  contributed  significantly  to  the 
final  result. 

M. Lehman: A Survey of Problems and Preliminary Results Concerning
Parallel  Processing  and  Parallel  Processors,  Proceedings  of  the  Institute  of 
Electrical  and  Electronics  Engineers,  vol.  54,  no.  12,  pp.  1889-1901, 
December,  1966,  by  permission  of  the  author  and  the  IEEE.  The  author 
acknowledges: 

This paper reports on a group activity in which each individual member
had his own specific assignments and in addition participated in
regular discussions on all aspects of the project. Credit is therefore
due to all members of the group which, during the period covered by
the contents of this paper, included G. C. Driscoll, J. M. Lee, A. P.
Mullery, J. L. Rosenfeld, H. P. Schlaeppi, and M. Weitzman. I should
also like to express my sincere thanks to Dr. H. A. Ernst for the
constructive criticism, advice, and encouragement offered during
preparation of this paper. My sincere thanks are also due to members of the
Graphics and Design Department at the Thomas J. Watson Research
Center, and in particular to G. Massi and Mrs. M. J. LaMarre for their
preparation of the charts and figures. Last, my thanks to Mrs. J. Galto
for her infinite patience in the repeated retypings of the manuscript.

A. L. Leiner, W. A. Notz, J. L. Smith, and A. Weinberger: PILOT, The
NBS Multicomputer System, Proceedings of the Eastern Joint Computer
Conference, 1958, pp. 71-75, by permission of the authors and the IEEE.
The  authors  acknowledge: 

The  authors  wish  to  acknowledge  the  valuable  contributions  of  their 
colleagues  H.  Loberman  and  W.  Youden,  who  helped  to  develop  the 
logical  design  and  programming  procedures  for  this  system. 

William Lonergan and Paul King: Design of the B 5000 System, Datamation,
vol. 7, no. 5, pp. 28-32, May, 1961, by permission of, published and
Copyrighted  ©  1961  by  F.  D.  Thompson  Publications,  Inc.,  Greenwich, 
Conn. 


Richard  E.  Monnier,  Thomas  E.  Osborne,  and  David  S.  Cochran:  The 
HP  Model  9100A  Computing  Calculator.  This  chapter  is  a  compilation  of 
three  articles:  A  New  Electronic  Calculator  with  Computerlike  Capabili- 
ties, by  Richard  E.  Monnier,  pp.  3-9;  Hardware  Design  of  the  Model 
9100A  Calculator,  by  Thomas  E.  Osborne,  pp.  10-13;  and  Internal 
Programming  of  the  9100A  Calculator,  by  David  S.  Cochran,  pp.  14-16, 
which  appeared  in  the  Hewlett-Packard  Journal,  volume  20,  no.  1,  Septem- 
ber, 1968,  by  permission  of  the  Hewlett-Packard  Journal. 

R.  E.  Porter:  The  RW-400— A  New  Polymorphic  Data  System,  Data- 
mation, vol.  6,  no.  1,  pp.  8-14,  January/February,  1960,  by  permission  of, 
published  and  Copyrighted  ©  1960  by  F.  D.  Thompson  Publications,  Inc., 
Greenwich,  Conn. 

J. C. Shaw, A. Newell, H. A. Simon, and T. O. Ellis: A Command Structure
for Complex Information Processing, Western Joint Computer Conference
1958, by permission of the authors and the IEEE.

W. Y. Stevens: The Structure of System/360, Part II — System Implementations,
IBM Systems Journal, vol. 3, no. 2, pp. 136-143, 1964, by permission
from the IBM Systems Journal.

James  E.  Thornton:  Parallel  Operation  in  the  Control  Data  6600,  Proceed- 
ings of  the  AFIPS  Fall  Joint  Computer  Conference,  Pt.  II,  vol.  26,  pp.  33-40, 
1964,  by  permission  from  AFIPS,  Spartan  Books,  Washington,  D.C. 

W.  L.  van  der  Poel:  ZEBRA,  A  Simple  Binary  Computer,  Proceedings  of 
an International Conference on Information Processing, Paris, UNESCO
House,  June,  1959,  pp.  361-365,  by  permission  from  AFIPS,  Spartan  Books, 
Washington,  D.C. 

Helmut  Weber:  A  Microprogrammed  Implementation  of  EULER  on  IBM 
System/360 Model 30, Communications of the Association for Computing
Machinery, vol. 10, no. 9, pp. 549-558, September, 1967, Copyright ©
1967  Association  for  Computing  Machinery,  Inc.,  by  permission  of  the 
author  and  the  Association  for  Computing  Machinery,  Inc.  The  author 
acknowledges: 

I  wish  to  thank  Jack  Carman,  who  wrote  the  I/O  Control  Program  and 
the  Operating  System  linkage  for  the  EULER  system  and  Miss  Sheila 
Morrison  who  helped  prepare  the  figures.  I  am  also  grateful  for  the 
valuable criticism offered by the referee, W. C. McGee, as well as by
Professor  N.  Wirth  and  E.  Satterthwaite. 


J. H. Wilkinson: The Pilot ACE, by permission from Automatic Digital
Computation, pp. 5-14, National Physical Laboratory, Teddington,
England, March 25-28, 1953.

M. V. Wilkes and J. B. Stringer: Micro-programming and the Design of
the  Control  Circuits  in  an  Electronic  Digital  Computer,  Proceedings  of 
the  Cambridge  Philosophical  Society,  Pt.  2,  vol.  49,  pp.  230-238,  April, 
1953,  by  permission  of  the  authors  and  the  Cambridge  Philosophical  Society, 
Cambridge,  England.  The  authors  acknowledge: 

The  authors  wish  to  express  their  thanks  to  Mr.  A.  L.  Freedman  and 
Mr.  W.  Renwick  for  assisting  them  in  clarifying  a  number  of  points, 
and  to  Professor  D.  R.  Hartree,  F.R.S.,  for  his  generous  help  with  the 
preparation  of  the  paper. 

Joseph  E.  Wirsching:  NOVA:  A  List-oriented  Computer,  Datamation, 
vol.  12,  no.  12,  pp.  41-43,  December,  1966,  by  permission  of,  published 
and  Copyrighted  ©  1966  by  F.  D.  Thompson  Publications,  Inc.,  Green- 
wich, Conn.  The  author  acknowledges: 

This  work  was  performed  under  the  auspices  of  the  U.S.  Atomic 
Energy  Commission. 

Several  organizations  have  contributed  to  the  writing  and  production  of 
this  book  by  giving  us  permission  to  use  material  from  their  publications. 
In  many  cases  they  have  also  supplied  us  with  original  copies.  We  have 
credited  their  text,  tables,  pictures,  and  diagrams  when  they  are  used. 
This  cooperation  has  been  invaluable.  The  specific  organizations  are: 

Adams's  Associates:  Computer  Characteristics  Quarterly.  (Adams,  1966-1968) 

Computers  and  Automation  magazine 

Control  Data  Corporation,  8100  34th  Avenue  South,  Minneapolis, 
Minnesota 

Datamation  magazine 

Digital  Equipment  Corporation,  146  Main  Street,  Maynard,  Massachusetts 

Hewlett-Packard Company, 1501 Page Mill Road, Palo Alto, California

International Business Machines Corporation, White Plains and
Poughkeepsie, New York

Massachusetts Institute of Technology, Cambridge, Massachusetts

National Science Foundation

Olivetti Underwood Corporation, 1 Park Avenue, New York, New York

Scientific Data Systems, 1649 Seventeenth Street, Santa Monica, California


Contributors 


R.  H.  Allmark 

W.  S.  Elliott 

W.  W.  Lichtenberger 

J.  L.  Smith 

R.  L.  Alonso 

T. O. Ellis

William  Lonergan 

W.  Y.  Stevens 

James  P.  Anderson 

R.  R.  Everett 

J.  R.  Lucking 

Richard  A.  Stokes 

Theodore  R.  Bashkow 

Herman  H.  Goldstine 

B.  G.  Maudsley 

J.  B.  Stringer 

George  H.  Barnes 

Samuel  A.  Hoffman 

Herbert  F.  Mitchell 

F.  H.  Sumner 

G.  A.  Blaauw 

A.  L.  Hopkins 

Richard  E.  Monnier 

James  E.  Thornton 

H. Blair-Smith

Thomas  W.  Kampe 

W.  A.  Notz 

W.  L.  van  der  Poel 

Erich  Bloch 

Maso  Kato 

Thomas  E.  Osborne 

John  von  Neumann 

F. P. Brooks, Jr.

T.  Kilburn 

C.  E.  Owen 

Helmut  Weber 

Richard  M.  Brown 

Paul  King 

M.  W.  Pirtle 

A.  Weinberger 

Arthur  W.  Burks 

David  J.  Kuck 

R. E. Porter

James R. Weiner

John  W.  Carr  III 

Arnold  Kronfeld 

Azra  Sasson 

H.  Frazer  Welsh 

David  S.  Cochran 

B.  W.  Lampson 

J.  C.  Shaw 

M.  V.  Wilkes 

C.  H.  Devonald 

M.  J.  Lanigan 

Joseph  Shifman 

J.  H.  Wilkinson 

D.  B.  G.  Edwards 

A.  L.  Leiner 

H.  A.  Simon 

Robert  J.  Williams 

J.  Presper  Eckert,  Jr. 

M.  Lehman 

Daniel  L.  Slotnick 

Joseph  E.  Wirsching 



Contents¹

Preface
Acknowledgments
Contributors  xiii


Part  1    The  Structure  of  Computers 

Chapter 1  Introduction  3
Chapter 2  The PMS and ISP Descriptive Systems  15
Chapter 3  The Computer Space  37


Part 2   The Instruction-set Processor: Main-line Computers


Section  1    Processors  with  One  Address  per  Instruction  89 


Chapter 4  Preliminary Discussion of the Logical Design of an Electronic
    Computing Instrument—Arthur W. Burks, Herman H. Goldstine, and
    John von Neumann  92
Chapter 5  The DEC PDP-8  120
Chapter 6  The Whirlwind I Computer—R. R. Everett  137
Chapter 33  The IBM 1800
Chapter 7  Some Aspects of the Logical Design of a Control Computer: A Case
    Study—R. L. Alonso, H. Blair-Smith, and A. L. Hopkins  146
Chapter 42  The SDS 910-9300 Series
Chapter 16  The LGP-30 and LGP-21
Chapter 17  IBM 650 Instruction Logic—John W. Carr III
Chapter 41  The IBM 7094 I, II
Chapter 8  The UNIVAC System—J. Presper Eckert, Jr., James R. Weiner,
    H. Frazer Welsh, and Herbert F. Mitchell  157
Chapter 23  One-level Storage System—T. Kilburn, D. B. G. Edwards,
    M. J. Lanigan, and F. H. Sumner
Chapter 34  The Engineering Design of the Stretch Computer—Erich Bloch


Section  2    Processors  with  a  General-register  State 


Chapter 9  The Design Philosophy of Pegasus, A Quantity-production
    Computer—W. S. Elliott, C. E. Owen, C. H. Devonald, and
    B. G. Maudsley  171
Chapter 43  The Structure of System/360, Part I—Outline of the Logical
    Structure—G. A. Blaauw and F. P. Brooks, Jr.
Chapter 10  An 8-bit-character Computer  184
Chapter 39  Parallel Operation in the Control Data 6600—James E. Thornton


¹This is a "virtual" contents, which means that because many of the computers are relevant to more than one part and section, we have used italic
type to indicate a nonsequential mapping for computers placed out of "physical" order. The reader might read (reference) the book according to the
virtual order.



Part  3    The  Instruction-set  Processor  Level:  Variations  in  the  Processor 


Section  1    Processors  with  Greater  than  One  Address  per  Instruction  191 

Chapter 11  The Pilot ACE—J. H. Wilkinson  193
Chapter 12  ZEBRA, A Simple Binary Computer—W. L. van der Poel  200
Chapter 13  UNIVAC Scientific (1103A) Instruction Logic—John W. Carr III  205
Chapter 14  Instruction Logic of the MIDAC—John W. Carr III  209
Chapter 15  Instruction Logic of the Soviet Strela (Arrow)—John W. Carr III  213
Chapter 38  The RW-400: A New Polymorphic Data System—R. E. Porter


Section  2    Processors  Constrained  by  a  Cyclic,  Primary  Memory  216 


Chapter 19  The OLIVETTI Programma 101 Desk Calculator
Chapter 12  ZEBRA, A Simple Binary Computer—W. L. van der Poel
Chapter 16  The LGP-30 and LGP-21  217
Chapter 11  The Pilot ACE—J. H. Wilkinson
Chapter 8  The UNIVAC System—J. Presper Eckert, Jr., James R. Weiner,
    H. Frazer Welsh, and Herbert F. Mitchell
Chapter 9  The Design Philosophy of Pegasus, A Quantity-production
    Computer—W. S. Elliott, C. E. Owen, C. H. Devonald, and
    B. G. Maudsley
Chapter 17  IBM 650 Instruction Logic—John W. Carr III  220
Chapter 26  NOVA: A List-oriented Computer—Joseph E. Wirsching


Section 3    Processors for Variable-length-string Data 224


Chapter 18  The IBM 1401  225
Chapter 10  An 8-bit-character Computer


Section  4    Desk  Calculator  Computers:  Keyboard  Programmable  Processors  with  Small  Memories  235 


Chapter 19  The OLIVETTI Programma 101 Desk Calculator  237
Chapter 20  The HP Model 9100A Computing Calculator—Richard E. Monnier,
    Thomas E. Osborne, and David S. Cochran  243


Section  5    Processors  with  Stack  Memories  (Zero  Addresses  per  Instruction)  257 


Chapter 21  Design of an Arithmetic Unit Incorporating a Nesting Store—
    R. H. Allmark and J. R. Lucking  262
Chapter 22  Design of the B 5000 System—William Lonergan and Paul King  267
Chapter 36  D825—A Multiple-computer System for Command and Control—
    James P. Anderson, Samuel A. Hoffman, Joseph Shifman, and
    Robert J. Williams
Chapter 30  A Command Structure for Complex Information Processing—
    J. C. Shaw, A. Newell, H. A. Simon, and T. O. Ellis
Chapter 32  A Microprogrammed Implementation of EULER on IBM System/360
    Model 30—Helmut Weber




Section 6    Processors with Multiprogramming Ability 274


Chapter 23  One-level Storage System—T. Kilburn, D. B. G. Edwards,
    M. J. Lanigan, and F. H. Sumner  276
Chapter 22  Design of the B 5000 System—William Lonergan and Paul King
Chapter 24  A User Machine in a Time-sharing System—B. W. Lampson,
    W. W. Lichtenberger, and M. W. Pirtle  291


Part  4   The  Instruction-set  Processor  Level:  Special-function  Processors 


Section  1    Processors  to  Control  Terminals  and  Secondary  Memories  (Input-output  Processors)  303 


Chapter 41  The IBM 7094 I, II
Chapter 43  The Structure of System/360, Part I—Outline of the Logical
    Structure—G. A. Blaauw and F. P. Brooks, Jr.
Chapter 33  The IBM 1800
Chapter 25  The DEC 338 Display Processor  305


Section  2    Processors  for  Array  Data  315 


Chapter 26  NOVA: A List-oriented Computer—Joseph E. Wirsching  316
Chapter 27  The ILLIAC IV Computer—George H. Barnes, Richard M. Brown,
    Maso Kato, David J. Kuck, Daniel L. Slotnick, and
    Richard A. Stokes  320


Section  3    Processors  Defined  by  a  Microprogram  334 

Chapter 28  Microprogramming and the Design of the Control Circuits in an
    Electronic Computer—M. V. Wilkes and J. B. Stringer  335
Chapter 29  The Design of a General-purpose Microprogram-controlled
    Computer with Elementary Structure—Thomas W. Kampe  341
Chapter 20  The HP Model 9100A Computing Calculator—Richard E. Monnier,
    Thomas E. Osborne, and David S. Cochran
Chapter 32  A Microprogrammed Implementation of EULER on IBM System/360
    Model 30—Helmut Weber


Section  4    Processors  Based  on  a  Programming  Language  348 


Chapter 30  A Command Structure for Complex Information Processing—
    J. C. Shaw, A. Newell, H. A. Simon, and T. O. Ellis  349
Chapter 31  System Design of a FORTRAN Machine—Theodore R. Bashkow,
    Azra Sasson, and Arnold Kronfeld  363
Chapter 32  A Microprogrammed Implementation of EULER on IBM System/360
    Model 30—Helmut Weber  382




Part  5    The  PMS  Level 


Section  1    Computers  with  One  Central  Processor  395 


Chapter 6  The Whirlwind I Computer—R. R. Everett
Chapter 42  The SDS 910-9300 Series


Section  2    Computers  with  One  Central  Processor  and  Multiple  Input/Output  Processors  396 


Chapter 5  The DEC PDP-8
Chapter 33  The IBM 1800  399
Chapter 41  The IBM 7094 I, II
Chapter 43  The Structure of System/360, Part I—Outline of the Logical
    Structure—G. A. Blaauw and F. P. Brooks, Jr.
Chapter 34  The Engineering Design of the Stretch Computer—Erich Bloch  421
Chapter 35  PILOT, The NBS Multicomputer System—A. L. Leiner, W. A. Notz,
    J. L. Smith, and A. Weinberger  440


Section  3    Computers  for  Multiprocessing  and  Parallel  Processing  446 


Chapter 36  D825—A Multiple-computer System for Command and Control—
    James P. Anderson, Samuel A. Hoffman, Joseph Shifman, and
    Robert J. Williams  447
Chapter 22  Design of the B 5000 System—William Lonergan and Paul King
Chapter 37  A Survey of Problems and Preliminary Results Concerning
    Parallel Processing and Parallel Processors—M. Lehman  456


Section  4    Network  Computers  and  Computer  Networks  470 


Chapter 38  The RW-400: A New Polymorphic Data System—R. E. Porter  477
Chapter 39  Parallel Operation in the Control Data 6600—James E. Thornton  489
Chapter 40  Computer Network Examples  504


Part  6    Computer  Families 


Section  1    The  IBM  701-7094  II  Sequence,  a  Family  by  Evolution  515 


Chapter  41    The  IBM  7094  I,  II  517 


Section  2    The  SDS  910-9300  Series,  a  Planned  Family  542 


Chapter  42    The  SDS  910-9300  Series  543 


Section  3    The  IBM  System/360— A  Series  of  Planned  Machines  Which  Span  a  Wide  Performance  Range  561 

Chapter 43  The Structure of System/360, Part I—Outline of the Logical
    Structure—G. A. Blaauw and F. P. Brooks, Jr.  588
Chapter 44  The Structure of System/360, Part II—System Implementations—
    W. Y. Stevens  602




Appendix  PMS  and  ISP  Notations  607 


General Conventions  607

    1  Basic Semantics  608
    2  Metanotation  608
    3  Basic Syntax  609
    4  Commands: Assignments, Abbreviation, Variables, Forms  609
    5  Indefinite Expressions  610
    6  Lists and Sets  611
    7  Definite Expressions  611
    8  Attributes  612
    9  Null Symbol and Optional Expression  613
    10  Names  613
    11  Numbers  614
    12  Quantities, Dimensions, and Units  615
    13  Boolean and Relations  615

PMS Conventions  615

    1  Dimensions  616
    2  General Units  616
    3  Information Units  616
    4  Component  617
    5  Link (L)  619
    6  Memory (M)  620
    7  Switch (S)  623
    8  Control (K)  624
    9  Transducer (T)  625
    10  Data-operations (D)  626
    11  Processor (P)  626
    12  Computer (C)  628

ISP Conventions  628

    1  Data-types  629
    2  Instruction  631
    3  Operations  632
    4  Processors  635

Bibliography  638

Name Index  653

Machine and Organization Index  656

Subject Index  661

Part  1 

The  structure  of  computers 


Chapter  1 


This  book  presents  many  examples  of  computer  systems.  It  presents 
them  in  enough  detail  so  that  meaningful  engineering  study  and 
analysis are possible. Most of these examples are presented by
using the original descriptions of them in the technical literature.
Others have been redescribed by us, especially where the original
descriptions  existed  only  in  technical  manuals.  In  both  cases  there 
are  considerable  discussion  and  analysis  of  the  computer  struc- 
tures: what  problems  they  were  intended  to  solve,  what  solutions 
were  adopted,  and  how  these  solutions  have  fared.  Yet  the  em- 
phasis has  remained  on  detailed  descriptions  precise  enough  so 
that  the  systems  themselves  are  available  for  independent  study. 

Why  should  one  want  to  produce  such  a  book?  Collections  of 
reprintings  from  the  technical  literature  are  common  in  many 
science  and  engineering  fields,  e.g.,  "Programming  Systems  and 
Languages"  [Rosen,  1967].  We  have  departed  from  this  tradi- 
tional exercise  in  two  ways,  both  of  which  seem  important  to  us. 
First,  we  have  presented  substantial  amounts  of  detail;  in  effect, 
block diagrams of computer structures and the equivalents of
programming  manuals.  These  constitute  neither  good  reading  nor 
a  way  of  communicating  the  "essential  ideas"  in  the  field.  Second, 
we  have  introduced  a  system  of  notation  and  have  used  it  not  only 
in  the  parts  we  ourselves  have  written  but  also  to  provide  addi- 
tional (sometimes  redundant)  descriptions  of  computer  systems  in 
the  reprinted  articles.  Why  should  there  be  a  book  like  this?  The 
reasons  are  several  and  require  some  background  discussion. 


Computer  systems 

Computer systems are one example of man's more complex artificial
systems.¹ They have existed as successful engineering products long
enough to undergo radical evolution and to give rise to a number of
basic, unique technologies. They are sufficiently complex that they have
given rise to a science, that is, to a continuing, institutionalized
endeavor to understand what sort of beast has been brought forth.² Our
fundamental interest is in the development of this science and technology
of computers (one of us also likes to build computers). To understand why
this particular book seems to us to be the right way to push this
development at this particular time requires characterizing the current
state of computer-systems technology.

¹We need not argue that they are his most complex system. That view
is myopic. Setting aside quasi-natural systems, such as cities and economies,
it is still the case that a modern aircraft carrier is more complex than a
modern computer by any reasonable measure.

²Here uniqueness can be claimed, perhaps, since few other artifactual
systems (again, excluding the quasi-natural ones) provide new phenomena
that require sustained scientific investigation to understand them. There
certainly is no science of aircraft carriers. But there is a computer science.

A computer system is complex in several ways. Figure 1 shows
the most important. There are at least four levels of system description,
possibly five, that can be used for a computer. These are not
alternative descriptions in the sense that anything said one way
can be said another. On the contrary, each level arises from abstraction
of the levels below it. Each does a job that the lower
levels could not perform because of the unnecessary detail they
would be forced to carry around.

A system (at any level) is characterized by a set of components,
of which certain properties are posited, and a set of ways of combining
components to produce systems. When formalized appropriately,
the behavior of a system is determined by the behavior
of its components and the specific modes of combination used.


[Figure 1 lists, from the top of the hierarchy to the bottom:]

Structures: Network/N, computer/C
Components: Processors/P, memories/M, switches/S, controls/K,
    transducers/T, data operators/D, links/L

Structure: Programs, subprograms
Components: State (memory cells), instructions, operators, controls,
    interpreter

Circuits: Arithmetic unit
Components: Registers, transfers, controls, data operators (+, -, etc.)

Circuits: Counters, controls, sequential transducer, function generator,
    register arrays
Components: Flip-flops: reset-set/RS, JK, delay/D, toggle/T, latch,
    delay, one shot

Circuits: Encoders, decoders, transfer arrays, data ops, selectors,
    distributors, iterative networks
Components: AND, OR, NOT, NAND, NOR

Circuits: Amplifiers, delays, attenuators, multivibrators, clocks, gates,
    differentiator
Active components: Relays, vacuum tubes, transistors
Passive components: Resistor/R, capacitor/C, inductor/L, diode, delay lines

Components: states, inputs, outputs

Fig. 1. Hierarchy of levels: computer structure.




Elementary circuit theory is an almost prototypic example. The
components  are  R's,  L's,  C's,  and  voltage  sources.  The  mode  of 
combination  is  to  run  wires  between  the  terminals  of  components, 
which  corresponds  to  an  identification  of  current  and  voltage  at 
these  terminals.  The  algebraic  and  differential  equations  of  circuit 
theory  provide  the  means  whereby  the  behavior  of  a  circuit  can 
be  computed  from  the  properties  of  its  components  and  the  way 
the  circuit  is  constructed. 
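
As an illustration of how a circuit's behavior follows from its component laws and its wiring, here is a minimal Python sketch. It assumes a single resistor and capacitor in series with a constant voltage source; that circuit and the component values are illustrative assumptions and are not taken from the book's figures. The component laws (Ohm's law and the capacitor law) are integrated step by step, and the result is compared with the closed-form solution of the same differential equation.

```python
import math

def rc_step_response(r_ohms, c_farads, v_source, t_end, steps):
    """Integrate C * dv/dt = (v_source - v) / R by forward Euler, from v = 0."""
    dt = t_end / steps
    v = 0.0
    for _ in range(steps):
        i = (v_source - v) / r_ohms   # Ohm's law across the resistor
        v += (i / c_farads) * dt      # capacitor law: dv/dt = i / C
    return v

# Illustrative values only (assumptions, not from the text).
R, C, V = 1_000.0, 1e-6, 3.0
tau = R * C                           # time constant of the RC pair

numeric = rc_step_response(R, C, V, t_end=tau, steps=10_000)
analytic = V * (1.0 - math.exp(-1.0)) # closed-form solution evaluated at t = tau
print(f"numeric={numeric:.4f} V, analytic={analytic:.4f} V")
```

The numerical and analytic values agree closely, which is the point of the passage: once the components and their interconnection are fixed, the equations determine the behavior.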

There  is  a  recursive  feature  to  most  system  descriptions.  A 
system,  composed  of  components  structured  in  a  given  way,  may 
be  considered  a  component  in  the  construction  of  yet  other  sys- 
tems. There  are,  of  course,  some  primitive  components  whose 
properties  are  not  explicable  as  the  resultant  of  a  system  of  the 
same  type.  For  example,  a  resistor  is  not  to  be  explained  by  a 
subcircuit  but  is  taken  as  a  primitive.  Sometimes  there  are  no 
absolute  primitives,  it  being  a  matter  of  convention  what  basis 
is  taken.  For  example,  one  can  build  logical  design  systems  from 
many different primitive sets of logical operations (AND and NOT,
NAND,  OR  and  NOT,  etc.). 
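
The choice of primitives can be made concrete with a short Python sketch. It takes NAND as the only primitive and builds NOT, AND, and OR from it, then checks each against Python's built-in operators over every input combination. The function names are ours, chosen for illustration.

```python
from itertools import product

def nand(a, b):
    """The single primitive operation."""
    return not (a and b)

def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))   # De Morgan: a OR b = NOT(NOT a AND NOT b)

# Exhaustive check over all input combinations.
for a, b in product([False, True], repeat=2):
    assert not_(a) == (not a)
    assert and_(a, b) == (a and b)
    assert or_(a, b) == (a or b)
```

Starting from the pair AND and NOT, or OR and NOT, would work equally well; which set is called primitive is a matter of convention, as the text says.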

A  system  level,  as  we  have  used  the  term  in  Fig.  1,  is  charac- 
terized by  a  distinct  language  for  representing  the  system  (that 
is,  the  components,  modes  of  combination,  and  laws  of  behavior). 
These  distinct  languages  reflect  special  properties  of  the  types  of 
components  and  of  the  way  they  combine.  Otherwise,  there  would 
be  no  point  in  adopting  a  special  representation.  Nevertheless, 
these levels exist in the system analyst's way of describing the same


[Figure 2: panels "Structure" (inverter circuit with a -15 volt supply) and "Behavior" (response to a 3-volt step input). The circuit equations and waveforms are not recoverable from this copy.]

Fig. 2. Electronic-circuit level: inverter circuit.


physically  existing  system.  The  fact  that  the  languages  are  highly 
distinct  makes  it  possible  to  be  confident  about  the  existence  of 
different  system  levels.  Where  we  are  fuzzy,  as  in  the  existence 
of  an  additional  intermediate  level,  it  is  because  new  representa- 
tions have  not  yet  congealed  into  distinct  formal  languages.  As 
we  noted,  within  each  level  there  exists  a  whole  hierarchy  of 
systems and subsystems. However, as long as these are all described
in  the  same  language,  e.g.,  a  subroutine  hierarchy,  all  given  in 
machine-assembly  language,  they  do  not  constitute  separate  sys- 
tem levels. 

With  this  general  view,  let  us  work  through  the  levels  of  com- 
puter systems,  starting  at  the  bottom.  Each  level  in  Fig.  1  actually 
has two languages or representations associated with it: an algebraic
one and a graphical one. These are isomorphic to each other,
the  same  entities,  properties,  and  relations  being  given  in  both. 

The  lowest  level  in  Fig.  1  is  the  circuit  level.  Here  the  com- 
ponents are  R's,  L's,  C's,  voltage  sources,  and  nonlinear  devices. 
The  behavior  of  the  system  is  measured  in  terms  of  voltage,  current, 
and  magnetic  flux.  These  are  continuously  varying  quantities  asso- 
ciated with  various  components,  and  so  there  is  continuous  be- 
havior through  time.  The  components  have  a  discrete  number  of 
terminals,  whereby  they  can  be  connected  to  other  components. 
Figure  2  shows  both  an  algebraic  and  graphical  description  of 
an  inverter  circuit,  as  well  as  an  algebraic  and  graphical  descrip- 
tion of  its  behavior.  We  note  that  its  structure  is  specified  first 
as  a  circuit  (a  directed  graph),  with  symbols  for  the  arcs  and  nodes. 
The  particular  circuit  still  is  an  abstraction  because  the  transistor 
Q1, the resistor R, and the stray capacitors Cs are given only token
values. The structure can be described symbolically by first writing
the relationship describing each of the components (i.e., Ohm's
law, Faraday's law, etc.) and then the equation which describes
the interconnection of the components (i.e., Kirchhoff's laws). We
observe the behavior of the circuit (probably using an oscilloscope)
by applying an input ei(t) and observing an output eo(t). Alternatively,
if we solve the equations which specify the structure, we
obtain  expressions  which  describe  the  behavior  explicitly. 

The  circuit  level  is  not  in  fact  the  lowest  level  that  might  be 
used  in  describing  a  computer  system.  The  devices  themselves 
require a different language, either that of electromagnetic theory
or  of  quantum  mechanics  (for  the  solid-state  devices).  It  is  usually 
an  exercise  in  a  course  on  Maxwell's  equations  to  show  that  circuit 
theory  can  be  derived  as  a  specialization  under  appropriately 
restricted  boundary  conditions.  Actually,  even  at  its  level  of  ab- 
straction, circuit  theory  is  not  quite  adequate  to  describe  computer 
technology  since  there  are  a  number  of  mechanical  devices  which 
must be represented. Magnetic tapes and drums are most likely
to come to mind first, but card readers, card punches, and Teletype
terminals  are  other  examples.  These  devices  obey  laws  of  motion 
and  are  analyzed  in  units  of  mass,  length,  and  time. 

The next level is the logic level. It is unique to digital technology,
whereas the circuit level (and below) is what digital technology
shares with the rest of electrical engineering. The behavior
of  a  system  is  now  described  by  discrete  variables  which  take  on 
only  two  values,  called  0  and  1  (or  +  and  — ,  true  and  false,  high 
and  low).  The  components  perform  logical  functions:  AND,  OR, 
NOT,  NAND,  etc.  Systems  are  constructed  in  the  same  way  as 
at  the  circuit  level,  by  connecting  the  terminals  of  components, 
which  thereby  identify  their  behavioral  values.  The  laws  of  bool- 
ean algebra  are  used  to  compute  the  behavior  of  a  system  from 
the  behavior  and  properties  of  its  components. 

The previous paragraph described combinatorial circuits whose
outputs are directly related to the inputs at any instant of time.
If the circuit has the ability to hold values over time (store
information), we get sequential circuits. The problem that the
combinatorial-level analysis solves is the production of a set of outputs
at time t as a function of a number of inputs at the same time t.
As described in textbooks, the analysis abstracts from any
transport delays between input and output; however, in engineering
practice the analysis of delays is usually considered to be still part
of the combinatorial level. In Fig. 3 we show a combinatorial
network formed from combinatorial elements which realize three
boolean output expressions, O1, O2, and O3, as a function of the input
boolean variables A and B. Note that in the symbolic representation
of the structure we can write an expression that reflects the
structure of the combinatorial network, but, on reduction, the
boolean equations no longer reflect the actual structure of the
combinatorial circuit but become a model to predict its behavior.
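
The distinction between a structural expression and its reduced form
can be illustrated with a small sketch. The network below is
hypothetical (it is not the network of Fig. 3): three NOR gates are
wired so that the structural expression mirrors the gates one for one,
while the reduced equation is simply A AND B. The function names are
our own, chosen for illustration.

```python
from itertools import product


def nor(x: int, y: int) -> int:
    """Two-input NOR gate on the values 0 and 1."""
    return int(not (x or y))


def structural(a: int, b: int) -> int:
    """Expression that mirrors the wiring: two NORs used as inverters
    feed a third NOR. Each call corresponds to one physical gate."""
    return nor(nor(a, a), nor(b, b))


def reduced(a: int, b: int) -> int:
    """Reduced boolean equation: predicts behavior, says nothing
    about how many gates exist or how they are connected."""
    return a & b


def truth_table(f):
    """Outputs of f for inputs (A, B) = 00, 01, 10, 11."""
    return [f(a, b) for a, b in product((0, 1), repeat=2)]
```

Both functions yield the same truth table, which is all that the
reduced equation promises; the count of three gates survives only in
the structural form.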

The representation of a sequential switching circuit is basically
the same as that of a combinatorial switching circuit, although
one needs to add memory components, such as a delay element
(which produces as output at time t the input at time t - τ). Thus
the equations that specify structure must be difference equations
involving time. Again, there is a distinction (even in
representation) between synchronous circuits and asynchronous circuits,
namely, whether behavior can be represented by a sequence of
values at integral time points (t = 1, 2, 3, ...) or must deal in
continuous time. But this is a minor variation. Figure 4 gives a
sequential logic circuit in both an algebraic and a graphical form
and shows also the representation of the behavior of the system.
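
A synchronous circuit with one delay element can be sketched as a pair
of difference equations. The example below computes X + 1 from a
serial input string, least significant bit first, as in the caption of
Fig. 4. The output equation Sum(t) = X(t) XOR C(t) appears in the
figure; the carry equation C(t+1) = X(t) AND C(t) and the initial
carry of 1 are our assumptions, as are the helper names.

```python
def serial_increment(x_bits):
    """Serial computation of X + 1 with a single unit-delay element.

    x_bits: bits of X, least significant first.
    Returns the bits of X + 1 (same length, overflow dropped).
    """
    carry = 1  # assumed initial content of the delay element (the "+1")
    sum_bits = []
    for x in x_bits:
        sum_bits.append(x ^ carry)  # Sum(t) = X(t) XOR C(t)
        carry = x & carry           # C(t+1) = X(t) AND C(t)  (assumed)
    return sum_bits


def to_bits(value, width):
    """Integer to a list of bits, least significant first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """List of bits (least significant first) back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))
```

For instance, from_bits(serial_increment(to_bits(23, 7))) gives 24.
The value held in the delay element at each integral time point is the
entire state of the circuit.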

Now it is clear that logic circuits are simply a subspecies of
general circuits. Indeed, to design the logic components one
constructs circuit-level descriptions of them. For instance, Fig. 5
shows a circuit for a NAND (or NOR) gate plus a table of its
behavior. It is evident that its behavior corresponds to that of the
NAND gate only if certain restrictions hold: namely, that one does
not look at the voltage (which is identified as the behavior variable
in the logic circuit) during certain periods when it is transient
("settling down," to use the common phrase). Thus the logic level
is an instance of the circuit level only in the same sense that the
circuit level is an instance of Maxwell's equations, that is, as a
limiting case in which certain features are deliberately ignored.

One  buys  a  great  deal  from  the  specialization  to  logic  circuits, 
since  one  can  compute  the  behavior  of  circuits  at  the  logic  level 
that  are  extremely  complex  at  the  circuit  level.  The  techniques 
for  doing  so  use  an  entirely  different  mathematical  apparatus.  In 
general,  we  cross  into  another  level  when  the  representation  at 
the  previous  level  provides  information  that  is  no  longer  relevant. 
A lower level is concerned with explaining the behavior of a
certain  structure,  whereas  the  next  highest  level  takes  the  lower 
level  as  given  (a  primitive).  The  higher  level  is  concerned  not  about 
internal  behavior  but  only  how  primitives  are  combined. 

A glance at Fig. 1 shows that we have described only the lower
part of the logic level. There is another part, called the
register-transfer level (or RT level). This is still an uncertain level,
a matter we will discuss after we have finished describing it. The
components of an RT system are registers and functional transfers
between registers. A register is a device that holds a set of bits.¹
The behavior of the system is given by the time course of values
of these registers, i.e., their bit sets.

Fig. 3. Combinatorial-switching-circuit sublevel of the logic level:
realization of three logic expressions.

Fig. 4. Sequential-switching-circuit sublevel of the logic level:
computation of X + 1 from serial input string x.

The system undergoes discrete operations, whereby the values
of various registers are combined according to some rule and then
are stored in another register (thus "transferred"). The law of
combination may be almost anything, from the simple unmodified
transfer (A ← B) to logical combination (A ← B ∧ C) to arithmetic
(A ← B + C). Thus a specification of the behavior, equivalent to
the boolean equations of sequential circuits or the differential
equations of the circuit level, is a set of expressions (often called
productions) which give the conditions under which such transfers
will be made. In Fig. 6 we give a picture of an RT system to
compute the sum of integers. The figure includes the specification
of its behavior and a table that shows the resulting behavior over
time. Here the graphical structure of the system includes registers
(N, I, S), transfers (S ← S + I), and data operators (S + I, I > N,
etc.). The flowchart shows the behavior of the control with time.

¹ This assumes that the elementary state variable of the system holds a bit
(i.e., one of two values, such as 0 or 1). This need not be; sometimes the
elementary variable holds a decimal digit (one of 10 values) or a character
(one of, say, 48 values). For present purposes we can talk in terms of
bits, without losing anything thereby.
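
A production system of this kind can be simulated directly. The sketch
below is a minimal one under stated assumptions: the two productions
are our reading of the sum-of-integers example (a start production
that clears S and I and sets run, and a run production that performs
I ← I + 1 and S ← S + I while I ≤ N, and otherwise clears run), and
all transfers fired in one step read the register values from before
that step. The function and key names are hypothetical.

```python
def run_rt_sum(n):
    """Simulate an RT system that accumulates 0 + 1 + ... + n in S.

    Returns the final S and a trace of (I, S, run) after each step.
    """
    reg = {"N": n, "I": 0, "S": 0, "start": 1, "run": 0}
    trace = []
    while reg["start"] or reg["run"]:
        nxt = dict(reg)  # right-hand sides read the old register values
        if reg["start"] and not reg["run"]:
            # start and not run -> (S <- 0; I <- 0; start <- 0; run <- 1)
            nxt.update(S=0, I=0, start=0, run=1)
        elif reg["run"]:
            if reg["I"] <= reg["N"]:
                # (I <= N) -> (I <- I + 1; S <- S + I), in parallel
                nxt["I"] = reg["I"] + 1
                nxt["S"] = reg["S"] + reg["I"]
            else:
                # (I > N) -> (run <- 0)
                nxt["run"] = 0
        reg = nxt
        trace.append((reg["I"], reg["S"], reg["run"]))
    return reg["S"], trace
```

With n = 4 the final value of S is 10. The trace plays the role of the
behavior table: one row of register values per discrete operation.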

The  register-transfer  level  is  still  uncertain  because  there  is 
substantial  agreement  neither  on  the  exact  language  to  be  used 
for  the  level  nor  on  the  techniques  of  analysis  and  synthesis  that 
go  with  it.  As  we  will  note  below,  for  both  the  circuit  level  and 
the  logic-circuit  level  there  exist  well-defined  representations, 
guaranteed,  so  to  speak,  by  standard  textbooks  and  college  courses 
that  teach  these  levels.  Standard  texts  on  digital  computers  make 
only  informal  use  of  the  RT  level. 

We have indeed a systems level in emergence here. If one
restricts the transfer operations to boolean operations and thinks
of a register as simply a set of 1-bit memories, one can write a
set of logic equations for any register-transfer system. Furthermore,
if one considers the role of logic design in digital computers, this
has encompassed both sequential circuits and the register-transfer
level. The practicing logic designer (by now an institutionalized
position, on a par with that of circuit designer) has sequential and
combinatorial circuits as his basic analytic tools, and he attempts
to design systems on the register-transfer level (e.g., central
processors) with these as tools. The register-transfer level has emerged
from the informal attempts to create a notation closer to the job
to be done.

Table of circuit behavior (multiple-input inverter circuit)

  Input, volts         Output, volts
    1    2    3
    0    0    0             -3
    0    0   -3             -3
    0   -3    0             -3
    0   -3   -3             -3
   -3    0    0             -3
   -3    0   -3             -3
   -3   -3    0             -3
   -3   -3   -3              0

Table of NOR behavior (NOR logic element)

  Inputs       Output
  1  2  3
  1  1  1        0
  1  1  0        0
  1  0  1        0
  1  0  0        0
  0  1  1        0
  0  1  0        0
  0  0  1        0
  0  0  0        1

Table of NAND behavior (NAND logic element)

  Inputs       Output
  1  2  3
  0  0  0        1
  0  0  1        1
  0  1  0        1
  0  1  1        1
  1  0  0        1
  1  0  1        1
  1  1  0        1
  1  1  1        0

Fig. 5. Change of representation at the boundary between the circuit
level and the combinatorial-switching sublevel.
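
The change of representation in Fig. 5 can be checked mechanically.
The sketch below takes the circuit table of the figure (output 0 volts
only when all three inputs are at -3 volts, otherwise -3 volts) and
reads it under two voltage-to-bit assignments. The function names and
the dictionary form of the assignment are our own choices for
illustration.

```python
from itertools import product


def inverter_circuit(v1, v2, v3):
    """Circuit-level behavior of the multiple-input inverter:
    0 V out only when every input is at -3 V, otherwise -3 V."""
    return 0 if (v1, v2, v3) == (-3, -3, -3) else -3


def logic_table(volts_for):
    """Read the circuit as a logic element.

    volts_for maps each logic value (0 or 1) to its voltage.
    Returns {(b1, b2, b3): output_bit} for all eight input rows.
    """
    bit_for = {volts: bit for bit, volts in volts_for.items()}
    return {
        bits: bit_for[inverter_circuit(*(volts_for[b] for b in bits))]
        for bits in product((0, 1), repeat=3)
    }


nor_table = logic_table({1: 0, 0: -3})   # 0 V read as logical 1
nand_table = logic_table({1: -3, 0: 0})  # -3 V read as logical 1
```

The same physical circuit thus appears as a NOR element under one
assignment and as a NAND element under the other; the logic-level
description depends on the convention as well as on the circuit.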

Recently there have been a number of efforts to construct
formalized register-transfer systems. Most of them are built
around the construction of a programming system or language that
permits computer simulation of systems on the RT level. Although
there is agreement on the basic components and types of
operations, there is much less agreement on the representation of the
laws of the system (corresponding to the production system in Fig.
6) or on the way to represent the dynamic behavior (corresponding
to the behavior table in the figure).

Fig. 6. Register-transfer sublevel of the logic level: computation of the
sum of integers.

  Next-state table

  Present      Xr,X inputs
  state        00    01    1*
  N            N     N     C
  C            N     C     C

  Output table (Sum)

  Present      Xr,X inputs
  state        00    01    1*
  N            0     1     *
  C            1     0     *

Fig. 7. State-system representation of the logic level: computation of
X + 1 from serial input string x.

There is another representation used at the logic level, the
state-system representation, but it has been put at one side in Fig.
1. The state system is the most general representation of a discrete
system available.¹ A system is represented as capable of being in
one of N abstract states at any instant of time. (For digital systems,
N is finite or enumerable.) Its behavior is specified by a transition
function that takes as arguments the current state and the current
input and determines the next state (and the concomitant output).
A digital computer is, in principle, representable as a state system,
but the number of states is far too large to make it useful to do
so. Instead, the state system becomes a useful representation in
dealing with various subparts of the total machine, such as the
sequential circuit that controls a magnetic tape. Here the number
of states is small enough to be tractable. Thus, we have placed
state systems at one side as an auxiliary to the logic level. In Fig.
7 we give the common representations of the state system.
Coincidentally, we use the representations of Fig. 7 for the sequential
switching circuit of Fig. 4. That is, Fig. 7 may be viewed as an
abstraction of the physical system in Fig. 4. To the logic designer
the state system is a useful abstraction of a logic design. A design
usually passes through the following problem representations:

1 The problem exists in a natural language.

2 The problem is converted to a state diagram (output as
a function of state, and input).

3 The state diagram is represented as a state table and
output table.

4 States are assigned (physical memory elements are used).

5 The excitation table and output tables are formed.

6 The excitation and output logic equations are written
(constrained by the actual logic elements).

7 The sequential circuit is drawn.

¹ There have been energetic attempts to apply the state-system approach
to control systems of a more general nature [Zadeh and Desoer, 1963],
although they do not concern us here.
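
Step 3 of this sequence, the state table and output table, can be
written down as data and executed. The sketch below encodes the tables
of Fig. 7 as we read them: two states, N (no carry) and C (carry);
inputs Xr (reset) and X; the entry "1*" stands for Xr = 1 with X
ignored, and the output there is unspecified, which we represent by
None. The dictionary layout and function names are hypothetical.

```python
# Transition function as a table: (state, input) -> next state.
NEXT_STATE = {
    ("N", "00"): "N", ("N", "01"): "N", ("N", "1*"): "C",
    ("C", "00"): "N", ("C", "01"): "C", ("C", "1*"): "C",
}

# Output function: (state, input) -> Sum. Missing entries are
# "don't care" and are reported as None.
OUTPUT = {
    ("N", "00"): 0, ("N", "01"): 1,
    ("C", "00"): 1, ("C", "01"): 0,
}


def step(state, xr, x):
    """One transition: returns (next_state, output)."""
    key = "1*" if xr else "0" + str(x)
    return NEXT_STATE[(state, key)], OUTPUT.get((state, key))


def run(inputs, state="N"):
    """Apply a sequence of (Xr, X) pairs; return outputs and states."""
    outputs, states = [], []
    for xr, x in inputs:
        states.append(state)
        state, out = step(state, xr, x)
        outputs.append(out)
    return outputs, states
```

As a usage example, a reset followed by the bits of 23 (least
significant first), run([(1, 0), (0, 1), (0, 1), (0, 1), (0, 0),
(0, 1), (0, 0), (0, 0)]), produces the bits of 24 after the initial
unspecified output. Nothing in the tables refers to gates or delay
elements; those enter only at steps 4 through 7.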

Let us go to the next higher level, the program level. This
not only is a unique level of description for digital technology (as
was the logic level) but is uniquely associated with computers,
namely, with those digital devices that have a central component
that interprets a programming language. There are many uses of
digital technology, especially in instrumentation and digital
controls, which do not require such an interpretation device and
hence have a logic level but no program level.

The  components  of  the  program  level  are  a  set  of  memories 
and  a  set  of  operations.  The  memories  hold  data  structures  which 
represent  things  both  inside  and  outside  the  memory,  e.g.,  num- 
bers, payrolls,  molecules,  other  data  structures,  etc.  The  operations 
take  various  data  structures  as  inputs  and  produce  new  data  struc- 
tures, which  again  reside  in  memories.  Thus  the  behavior  of  the 
system  is  the  time  pattern  of  data  structures  held  in  its  memories. 
The  unique  feature  of  the  program  level  is  the  representation  it 
provides  for  combining  components,  that  is,  for  specifying  what 
operations  are  to  be  executed  on  what  data  structures.  This  is  the 
program,  which  consists  of  a  sequence  of  instructions.  Each  in- 
struction specifies  that  a  given  operation  (or  operations)  be  exe- 
cuted on  specified  data  structures.  Superimposed  on  this  is  a  control 
structure  that  specifies  which  instruction  is  to  be  interpreted  next. 
Normally  this  is  done  in  the  order  in  which  the  instructions  are 
given,  with  jumps  out  of  sequence  specified  by  branch  instructions. 
Again,  Fig.  8  shows  a  simple  program,  the  data  structures,  and 
the  behavior. 
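
The components named above (memories, operations, a sequence of
instructions, and a control structure with branches) can be sketched as
a small interpreter. The instruction set below is hypothetical and is
not that of the PDP-8 or of any real machine; it has one accumulator
and named memory cells. The program computes the sum of the integers
0 through N, the same task as in Fig. 8.

```python
def run(program, memory, max_steps=10_000):
    """Interpret a list of instructions against a dict of named cells.

    Instructions are tuples: ("load", cell), ("add", cell),
    ("sub", cell), ("store", cell), ("jmp", target),
    ("jneg", target), ("halt",).
    """
    acc, pc = 0, 0
    for _ in range(max_steps):
        op, *arg = program[pc]
        pc += 1  # normal control: next instruction in sequence
        if op == "load":
            acc = memory[arg[0]]
        elif op == "add":
            acc += memory[arg[0]]
        elif op == "sub":
            acc -= memory[arg[0]]
        elif op == "store":
            memory[arg[0]] = acc
        elif op == "jmp":
            pc = arg[0]  # jump out of sequence
        elif op == "jneg":
            if acc < 0:
                pc = arg[0]  # branch only if accumulator is negative
        elif op == "halt":
            return memory
        else:
            raise ValueError("unknown operation: " + op)
    raise RuntimeError("step limit exceeded")


SUM_PROGRAM = [
    ("load", "S"), ("add", "I"), ("store", "S"),   # 0-2: S <- S + I
    ("load", "I"), ("sub", "N"), ("jneg", 7),      # 3-5: continue if I < N
    ("halt",),                                     # 6
    ("load", "I"), ("add", "ONE"), ("store", "I"), # 7-9: I <- I + 1
    ("jmp", 0),                                    # 10
]
```

Calling run(SUM_PROGRAM, {"N": 4, "I": 0, "S": 0, "ONE": 1}) leaves 10
in cell S. The behavior of the system is the succession of memory
contents; the program itself is only the specification of which
operation is applied to which cell, and in what order.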

Two things separate the logic level from the program level.
First, computer systems at the logic level are parallel devices, with
all components active simultaneously. At the program level,
computers are represented essentially as serial devices. Second, the
program level, but not the logic level, is essentially linguistic in
nature. At the program level things can be named, abbreviations
can be used, decisions can be made, instructions are interpreted:
all concepts that are strikingly absent from physical systems.
Of course, they are not "really" absent since one can give a full
description of the operation of a program at the logic level. But
one does so by carrying in mind the set of physical behaviors
discovered for computers that make them show the appropriate
linguistic behavior at the program level. Thus, one does not "go
to ALPHA if accumulator is negative"; one has a logic circuit that
transfers the contents of the address field of the instruction register
to the program counter, ANDing that transfer with the sign of
the accumulator, so that it does not take place if the accumulator
is not negative. Such a translation reveals how distinct is the
system boundary between the register-transfer level and the
program level. The size of the gap is also revealed in the ability of
people to become expert programmers without knowing anything
about any representations below the programming level.
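
The logic-level reading of "go to ALPHA if accumulator is negative"
can be sketched at the bit level. The word length (12 bits), the width
of the address field (7 bits), and the function name are assumptions
made only for this illustration; the gating of the transfer by the
sign bit is the point being shown.

```python
def branch_on_negative(pc, ir, ac, addr_bits=7, word_bits=12):
    """Next program-counter value for a 'jump if negative' step.

    pc: program counter (addr_bits wide)
    ir: instruction register (address field in the low addr_bits)
    ac: accumulator (word_bits wide, two's complement)
    """
    mask = (1 << addr_bits) - 1
    sign = (ac >> (word_bits - 1)) & 1  # sign bit of the accumulator
    gate = mask * sign                  # all ones if negative, else zero
    address_field = ir & mask
    # The transfer PC <- IR(address) is ANDed with the sign; when the
    # gate is closed the program counter keeps its old value.
    return (address_field & gate) | (pc & (mask ^ gate))
```

With ir = 0b101000101010 (address field 42) and pc = 7, a negative
accumulator such as 0b100000000000 yields 42, and a non-negative one
leaves 7. No naming or deciding occurs anywhere in the function; there
is only a masked transfer between registers.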

The  program  level  constitutes  an  entire  technology  in  its  own 
right,  and  one  that  carries  within  it  most  of  the  emergent  charac- 
teristics of  computer  systems  that  make  them  worthy  of  a  science. 
Among  the  programming  languages  alone,  there  are  levels  of  lan- 
guage which  are  so  distinct  from  each  other  as  to  constitute  system 
levels  fully  as  important  as  the  ones  exhibited  in  Fig.  1.  Never- 
theless, from  the  viewpoint  of  someone  basically  concerned  with 
hardware  systems,  these  can  all  be  accounted  a  single  level,  at 
least  for  the  present.  The  one  aspect  of  programming  systems  that 
should  be  of  most  concern,  that  of  operating  systems,  is  still  in 
such  a  fragmented  state  that  it  does  not  even  begin  to  be  a  distinct 
system  level. 

One  peculiarity  of  the  program  level  is  that  there  exists  no 
universal  representation  for  it,  as  there  does  for  the  circuit  or 
logic-circuit  level  (and,  it  is  to  be  hoped,  soon  for  the  register- 
transfer  level).  Each  machine  has  its  own  machine  language  (and 
its  own  assemblers  and  command  languages  built  on  those  ma- 
chine languages).  Each  of  these  languages  forms  a  complete  sys- 
tem at  the  program  level,  applicable  only  to  the  machine  in 
question.  There  is  no  universal  machine  language,  although  there 
is  much  in  common  at  a  conceptual  level  between  all  existing 
machine  languages.  There  has  existed  a  long-standing  attempt 
within  the  programming  field  to  develop  an  UNCOL  (for  Uni- 
versal Computer  Oriented  Language)  [Steel,  1961]  that  would 
play  this  role,  but  it  has  never  been  successful.  The  reasons  are 
not far to seek. The role of the machine language is to be
interpreted by the machine in order to produce behavior. It is not free
to have arbitrarily desirable properties from our human viewpoint,
since its details affect the efficient operation of the computer too
much: how much space is devoted to the program, how much
time is saved by a special order oriented to matrix multiply, etc.
UNCOL was also attempting to fill the same role as machine
languages, being one from which to compile a machine code for
an arbitrary machine. Another reason why there has been no
universal programming representation is that each particular
machine language is a language, and so a universal description
would seem to be a description of a class of languages. This is
by no means impossible, as the wide use of notations such as
Backus Normal Form (BNF) shows.¹ Nevertheless, it has
contributed to the lack of any universal notation.

¹ We will propose a notation later. See also the work by F. Haney in his
Generalized Instruction System (GIS) [Haney, 1968].

We now move to the fourth and last level. In Fig. 1 it is called
the Processor-Memory-Switch level, or PMS level for short. The
name is not recognized, nor is any other, since the level exists
only informally. Nevertheless, its existence is hardly in doubt. It
is the view one takes of a computer system when one considers
only its most aggregate behavior. It then consists of central
processors, core memories, tapes, disks, input/output processors,
communication lines, printers, tape controllers, busses, Teletypes,
scopes, etc. The system is viewed as processing a medium,
information, which can be measured in bits (or digits, characters, words,
etc.). Thus the components have capacities and flow rates as their
operating characteristics. All details of the program are
suppressed, although many gross distinctions of encoding and
information type remain, depending on the analysis. Thus one may
distinguish program from data, or file space from resident monitor.
One may remain concerned with the fact that input data are
alphameric and must be converted into binary, or are bit-serial
and must be converted to bit-parallel.

ALGOL program

  for I ← 0 step 1 until N do S ← S + I;

PDP-8 symbolic machine language program

  Loc      Oper       Comments
  start,   cla        clear AC
           dca S      deposit AC in M, clear AC
           dca I
  loop,    tad S      twos complement add
           tad I
           dca S
           tad N
           cia        negate AC (in twos complement)
           tad I
           sma cla    skip if -AC, clear AC
  stop,    hlt        halt
           isz I      index (by 1), skip if 0
           jmp loop   jump

Fig. 8. Programming level: computation of the sum of integers.

We  might  characterize  this  level  as  the  "chemical  engineering 
view  of  a  digital  computer,"  which  likens  it  more  to  a  continuous- 
process  petroleum-distilling  plant  than  to  a  place  where  complex 
FORTRAN  programs  are  applied  to  matrices  of  data.  Indeed,  this 
system  level  is  more  nearly  an  abstraction  from  the  logic  level 
than  from  the  program  level,  since  it  returns  to  a  simultaneously 
operating  flow  system. 

One might question whether there is a distinct systems level
here. In the early days of computers almost all computer systems
could be represented as in the diagram from M.I.T.'s Whirlwind
computer programming manual in Fig. 9, with classic boxes of
memory (storage), control, arithmetic, and input/output. Actually,
this view of the computer in 1953 was considerably advanced;
few texts on the logic design of computers in the 1960s have such
a detailed model. This model has secondary memory (magnetic
tape and drums in the Whirlwind's case). The most interesting
aspect of the model, and one which text writers omit, is the
inclusion of switching (the bus of Fig. 9). The bus provides a
communication path to link the other components. Certainly the
pushbuttons (actually the console) are novel for such a model.
Compare this with the diagram of a modern computer system in
Fig. 10, which shows a two-processor UNIVAC 1108, the level of
abstraction being the same as in Fig. 9. The arithmetic element of
Fig. 9 has disappeared and is replaced by a processor (a combined
control and arithmetic element) in Fig. 10. The central control of
Fig. 9 is now distributed throughout the remaining components.
The control in Fig. 10 is a combined unit for transforming a serial
character-information stream into words. It also manages the
transmission of a word vector between the primary memory and a
terminal or a secondary memory. The Resource Allocation Diagram
is introduced in Fig. 10 to describe the allocation (use), hence
behavior, of the PMS components as a function of time. Chapter 2
describes these figures more fully.

Fig. 9. Automatic digital computation. (From the Whirlwind Computer
Manual, M.I.T. By permission of the publishers.)

Another  indication  of  the  emergence  of  the  PMS  level  lies  in 
the  models  used  in  most  operations-research  types  of  studies  on 
computer  systems.  Again,  in  the  early  1960s  these  were  practi- 
cally nonexistent.  Now,  with  the  advent  of  multiprogramming, 
multiprocessing,  and  time  sharing,  and  the  imminent  arrival  of 
computer  networks,  there  are  substantial  numbers  of  such  studies. 
The level of abstraction is always one that considers only flows
and  stocks  of  information,  measured  in  bits  (or  an  equivalent), 
perhaps  divided  into  several  subtypes.  The  concerns  are  bottle- 
necks, capacities,  total  flow  rates,  queuing  problems,  buffer  sizes, 
and  the  like.  All  this  indicates  a  system  level  above  both  the  logic 
level  and  the  program  level. 
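
A study at this level of abstraction reduces each component to a name
and a flow rate. The sketch below is a minimal one: the component
names and the rates in bits per second are invented for illustration
and do not describe any machine in this book. It finds the bottleneck
on a path and the time to move a given stock of information along it.

```python
def bottleneck(path):
    """Return the (name, bits_per_second) pair that limits the path.

    path: list of (name, bits_per_second) for components in series.
    """
    return min(path, key=lambda component: component[1])


def transfer_time(bits, path):
    """Seconds needed to move `bits` through components in series,
    assuming steady flow at the rate of the slowest component."""
    return bits / bottleneck(path)[1]


# Hypothetical path from a secondary memory to primary memory.
TAPE_TO_MEMORY = [
    ("tape", 120_000),
    ("tape control", 1_000_000),
    ("switch", 5_000_000),
    ("primary memory", 20_000_000),
]
```

Here bottleneck(TAPE_TO_MEMORY) names the tape, and moving 1,200,000
bits takes 10 seconds. Nothing about instructions or data structures
enters the calculation, which is what marks it as a PMS-level question
rather than a program-level one.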

There  is  no  uniform  language  for  representation  at  this  level 
and  even,  as  we  noted,  no  standard  name.  We  have  used  the  term 
PMS  in  analogy  to  the  use  of  RT  for  the  register-transfer  level. 
Processors,  memories,  and  switches  are  the  main  kinds  of  com- 
ponents out  of  which  systems  at  this  level  are  built.  If  one  names 
a  number  of  components  at  the  PMS  level,  as  we  did  previously, 
one finds few switches in the list. "Busses" in our list would be
one, although many would think first of their data transfer
characteristics. But, as this book amply shows, what makes the PMS level
both interesting and complex is the existence of switches which
govern the pattern of information flow through the system. One
reason why they seem buried is their association with other
components as addressing systems. There are other components besides
processors, memories, and switches, namely, links, transducers, and
controls.  But  the  first  three,  P,  M,  and  S,  seem  appropriate  to 
characterize  the  level. 

It is not known whether there will be yet other systems levels,
say one above the PMS level, as networks come into existence.
The simplicity of the top level argues against it, but that may only
show our narrow vision. It is important to realize that these levels
are not sacrosanct. They depend strongly on physical technology.
Thus, as we move toward integrated circuitry, there may emerge
representations other than register-transfer diagrams, and the
latter may never develop into a clear systems level. One could even
imagine something happening to the circuit level, as continuous
distributions became more important (although the use of
equivalent circuits is well embedded in the engineering culture). We are
not concerned with predicting any particular changes. We wish
only to emphasize that the system-levels diagram of Fig. 1 is a
reflection both of current technology and of our ways of analyzing
given physical systems. As such, these levels have a certain
impermanency about them.

What  is  the  problem? 

The systems levels we have just described correspond to the
technologies that are available for the analysis and synthesis of
computer systems. Each of these levels exists, in fact, precisely to the
extent that a technology has become well developed. Thus both
the circuit level and the lower half of the logic level
(combinatorial and sequential circuits) are highly polished technologies. They
are what one learns today, if one wants to become a computer
engineer. Textbooks exist, courses are taught, and there is a
flourishing, cumulative technical literature. As we progress up the systems
levels, matters become progressively worse. The register-transfer
level is not yet well established, although there is considerable
current activity in the area, and the next few years may see its
universal establishment. Although programming is certainly well
defined, each machine is a king in his own court, with no common
technology of the program level that is relevant to the design of
computer systems. The latter phrase must be added since we are
taking a very specialized viewpoint here. We do not consider the
world of programming research at all, it being entirely divorced
from computer-systems design.¹ Finally, at the top, there is
practically no consensus on the nature of the systems level.

¹ This is not entirely true. Each level must provide coupling with adjacent
levels. A major issue in computer design is the trade-off between hardware
and software.

There is nothing very surprising about this state of affairs. It
reflects accurately the fundamental fact that only in the past few
years have computer systems become complex enough for the
higher levels to emerge as distinct systems levels. When most
computers could be described in the diagram of Fig. 9 (and such
a diagram was reprinted innumerable times in the first decade)
there was no need to have a technology at the PMS level. When
registers were so expensive that one could count the registers of
a processor on the fingers of one hand (no thumbs allowed), one
did not need a register-transfer language in order to describe the
flows. In both cases, an informal block diagram conveyed all the
information adequately.

Fig. 10. PMS level: UNIVAC 1108. (PMS diagram and resource allocation
diagram. Legend: Mp/primary memory; Ms/secondary memory; Pc/central
processor; T/terminal; L/link; S/switch; K/control; Kio/control for
io equipment.)

The  question  of  the  programming  level  is  somewhat  different, 
since  this  level  has  existed  as  a  formal  language  from  the  very  start. 
Here  the  key  aspect,  it  seems  to  us,  is  that,  since  well-defined 
languages  existed,  there  was  little  pressure  to  find  a  better  one. 
The  fact  that  such  languages  were  completely  idiosyncratic  to  the 
machine,  since  they  emerged  as  a  product  of  the  design  itself, 
simply  did  not  worry  anyone  overly  much.  Each  language  provided 
a  design  framework  one  could  work  into,  and  this  seemed  to  suffice. 
It  led,  it  is  true,  to  the  game  of  "We  have  another  bit  left  in  the 
mode  field  of  the  instruction — got  another  mode  you'd  like?" 
But  this  has  only  made  computer  designers  feel  that  creating  an 
order  code  was  something  of  an  art. 

Thus  we  feel  that  the  increased  complexity  of  computer  systems 
is  making  these  higher  system  levels  of  increasing  importance. 
Since  this  is  only  the  second  decade  of  the  serious  development 
of  computer  systems,  these  upper  levels  are  not  in  very  good  shape. 
For  instance,  textbooks  devote  very  little  attention  to  the  area. 
Textbooks  (especially  good  ones)  tend  to  be  technique-oriented, 
giving  most  attention  to  what  is  known.  (When  we  were  students 
we  always  used  to  wonder  why  there  were  no  mathematics  texts 
which  told  you  about  the  problems  that  were  not  solvable  in  closed 
form.)  Thus  the  present  need  for  some  material  at  these  higher 
levels  constitutes  a  major  motivation  for  this  book. 

There  is  a  second  feature  of  the  current  scene  that  enters  into 
our  motivation  for  this  book.  Around  1,000  different  computer 
systems  have  been  built.  This  represents  a  substantial  amount  of 
pragmatic  experimentation.  This  is  especially  true  at  the  program- 
ming level  and  PMS  level,  and  also  to  some  extent  at  the  register- 
transfer  level.  Many  things  have  been  tried,  many  found  worth- 
while, and  many  found  wanting.  A  good  deal  of  reinvention  goes 
on.  Thus  we  are  concerned  that  this  history  of  experimentation 
not  be  lost.  It  is  true  that,  if  the  underlying  technology  changes 
enough,  the  experience  may  become  largely  irrelevant,  but  this 
does  not  appear  to  us  to  be  an  imminent  development. 

We  will  admit  also  to  a  third  concern,  which  does  not  stem 
from  our  role  as  computer  engineers  concerned  with  design,  but 
from  our  role  as  computer  scientists,  fascinated  with  the  phenom- 
ena of  computers.  The  variety  of  about  1,000  computers  represents 
the  beginning  of  a  proliferation  of  a  species.  It  is  not  under  biologi- 
cal control  but  rather  under  economic  and  intellectual  control. 
Nevertheless,  it  is  in  every  sense  of  the  word  an  evolutionary 
population.  We  find  ourselves  feeling  a  little  like  naturalists  must 
have  felt  when  confronted  with  the  proliferation  of  the  organic 
world.  We  were  at  one  time  tempted  to  call  this  book  "Computer 


Botany"  and  at  another  "Computer  Taxonomy."  We  feel  that  the 
attempt  to  gather,  document,  and  classify  these  existing  computers 
is  a  worthy  endeavor  in  its  own  right.  One  might  think  that  all 
this  material  is  easily  available.  But  the  record  fades  rapidly, 
especially  when  much  of  it  exists  only  as  manufacturers'  manuals 
and  papers  in  assorted  proceedings. 

The  main  reasons  for  producing  this  book  and  for  its  particular 
character  are  by  now  evident.  There  is  a  need  for  material  on  the 
upper  levels  of  computer  systems,  both  for  teaching  new  students 
of  computer  science  and  engineering  and  for  making  the  past 
record  available  for  professional  designers.  Since  the  technologies 
are  not  well  developed  for  the  upper  levels,  it  is  not  possible  to 
write  a  textbook,  making  use  only  of  well-accepted  techniques, 
notations, and results. Instead, one settles for making available a
collection  of  examples  of  systems,  so  that  they  can  be  studied  and 
analyzed  directly. 

Notations 

It  remains  to  say  a  word  about  two  notations  we  have  introduced, 
both  about  our  motivations  for  doing  so  and  about  their  character. 
Some,  but  not  all,  of  this  is  already  implicit  in  the  foregoing  ac- 
count. 

We  started  simply  to  produce  a  set  of  readings  in  computer 
systems,  motivated  by  the  lack  of  detailed  examples  we  could  use 
in  a  course  one  of  us  (GB)  was  giving  on  computer  design.  As  noted, 
we  felt  the  need  to  expose  the  students  to  real  examples  of  complex 
computer  structures.  As  we  gathered  material  we  became  im- 
pressed (depressed  is  actually  a  better  term)  with  the  diversity  of 
ways  of  describing  these  higher  levels.  Even  more,  the  amount 
of  clumsy  description — downright  verbosity — even  in  purely 
technical  manuals  acted  as  a  further  depressant.  The  thought  of 
putting  such  a  congeries  of  descriptions  between  hard  covers  for 
one  person  to  peruse  and  study  was  almost  too  much  to  contem- 
plate. Gradually,  we  began  to  rewrite  and  condense  many  of  the 
descriptions.  As  we  did  so,  a  set  of  common  notations  developed. 
Becoming  aware  of  what  was  happening,  we  devoted  a  substantial 
amount  of  attention  and  effort  to  creating  notational  systems  that 
have  some  consistency  and,  we  hope,  some  chance  of  doing  the 
job  required.  These  are  the  PMS  descriptive  system  for  the  PMS 
level  (sic)  and  the  ISP  (Instruction-set  processor)  descriptive  sys- 
tem for  the  program  level.  Each  of  these  requires  some  comment 
on  its  nature  and  the  role  we  think  it  should  play. 

The  PMS  descriptive  system  is  meant  to  provide  a  notation 
for  the  top  level  of  computer  systems.  Figure  10  is  given  in  this 
notation.  On  the  surface  it  is  largely  self-explanatory,  given  the 




mnemonics  of  P  for  processor,  M  for  memory,  S  for  switch,  T  for 
transducer  (hence  also  terminal),  and  K  for  control  (since  C  is  for 
computer).  There  is  also  L  for  link,  but  in  most  computer  struc- 
tures it  is  unnecessary  to  distinguish  a  separate  link  component, 
except  to  show  connectivity.  (It  does  become  appropriate  if  com- 
munication delays  exist.) 

There  is  an  issue  about  whether  this  small  set  of  components 
is  an  appropriate  set  of  primitives,  but  the  issue  is  not  of  major 
proportions.  The  real  issues  in  the  development  of  the  notation 
come  from  the  stress  of  two  opposite  forces.  On  the  one  hand,  one 
wants  extremely  compact  notations  for  expressing  computer  sys- 
tems. The  systems  are  large  in  any  event,  and  if  there  is  much 
extra  notational  freight  in  the  way  of  fixed  formats,  forced  writing 
of  what  is  already  known  and  assumed,  etc.,  then  the  notation  will 
be  neither  useful  nor  used.  On  the  other  hand,  there  is  a  tremen- 
dous variety  and  quantity  of  information  that  potentially  must  be 
capable  of  being  written  into  a  description:  word  size,  capacity, 
flow,  operation  rate,  data-types,  variations  of  operation  rate  for 
different  classes  of  instructions,  parity  checking,  technology,  and 
on  and  on.  Thus  one  needs  a  notation  that  responds  to  both  these 
demands — and  without  being  hopelessly  complex  and  difficult  to 
learn.  Our  attempt  at  a  solution  involves  a  basically  simple  lan- 
guage with  comprehensive  (and  we  think  natural)  ways  of  sys- 
tematic abbreviation  and  abstraction. 

The  ISP  descriptive  system  is  meant  to  provide  a  uniform  way 
of describing instruction sets, that is, of giving the information
contained in a programming manual. It must provide the instruction
format, the registers referenced by the instructions, the rules
of  interpretation  of  the  instruction,  and  the  semantics  of  each 
instruction  in  the  processor's  repertoire.  It  must  be  able  to  do  this 
for  any  existing  computer,  plus  the  expected  extensions  into  the 
future.  Its  homeliest  virtue  is  to  make  it  possible  to  read  the 
descriptions  of  the  forty-odd  computer  systems  described  in  this 
book,  without  having  to  fight  a  new  notation  for  each  system,  and 
still  to  know  in  detail  what  the  instructions  really  do. 

Our  attempt  at  a  solution  turns  out  not  to  be  a  generalized 
sort  of  instruction.  Rather,  it  is  very  similar  in  flavor  to  a  register- 
transfer  scheme.  The  differences  lie  in  being  able  to  suppress  all 
timing  information  and  all  detail  that  is  not  essential  to  under- 
standing the  instructions.  ISP  is  not  a  variety  of  UNCOL,  in  which 
one  can  program;  rather  it  is  a  language  in  which  one  can  describe 
what  any  particular  instruction  set  does.  We  thus  avoid  many  of 
the  pitfalls  of  the  UNCOL-like  efforts. 

There  is  a  price  to  be  paid  for  introducing  new  notations,  for 
they  must  be  learned.  We  feel  that  the  two  systems  we  have 
introduced  here  are  natural  enough  to  require  almost  no  learning 


for  superficial  use  (e.g.,  looking  at  Fig.  10)  and  only  modest 
amounts  for  full  exploitation.  They  seem  to  us  vastly  preferable 
to  the  array  of  ad  hoc  notations  that  we  were  faced  with  initially 
(and  with  which  we  almost  faced  the  reader).  Still  we  are  aware 
of  the  price. 

A  word  should  be  said  about  antecedents.  The  PMS  descriptive 
system is close to the way computer scientists talk informally about
the  top  level  of  computer  systems;  no  one  effort  in  the  environment 
stands  out  as  a  predecessor.  Some  notations,  such  as  CPU  (for 
central  processing  units),  have  become  widespread.  We  clearly 
have  assimilated  them.  Our  modifications,  such  as  Pc  instead  of 
CPU, are dictated entirely by the attempt to build a consistent
notation  over  the  whole  range  of  computer  systems.  With  respect 
to ISP, we have been heavily influenced by the work on register-transfer
languages.' The one that we used most as a kernel from
which to grow ISP was the work of Darringer and Parnas [Darringer,
1969]. In particular, their decision to work within the
framework  of  ALGOL  suited  our  own  sensibilities,  even  though 
the  final  version  of  ISP  departs  from  a  sequential  algorithmic 
language  in  a  number  of  respects. 

Finally,  a  word  should  be  said  about  innocence  and  aspirations. 
We are putting PMS and ISP forward as two notations. They are
that.  But  they  also  imply  a  particular  view  of  digital  processing. 
Thus  they  are  not  entirely  innocent.  It  would  be  appropriate  to 
explore  fully  this  view  and  to  justify  the  particular  decompositions 
and  definitions  used.  This  is  not  to  say  that  these  views  are  pecu- 
liarly ours.  They  are  implicit  in  the  informal  use  of  similar  descrip- 
tive systems.  However,  the  attempt  to  formalize  a  notation  makes 
them  more  accessible.  We  accept  the  obligation  to  perform  such 
an exploration. But this volume is not the place to do so, for that
would turn it into something between a treatise and a textbook.
For  this  book,  it  is  appropriate  to  take  these  notations  at  face 
value.  We  have  a  companion  volume  in  preparation  that  attempts 
the  other  job.  This  is  an  aspiration. 

We have other aspirations as well. Notations in the computer
world  should  turn  into  working  tools.  There  are  many  tasks,  such 
as the communicative one of this book, where the notation by itself
is  useful.  Others  are  easy  to  imagine:  writing  specifications  for  new 
machines;  being  sure  what  the  computer  salesmen  are  selling; 
standardization  of  programming  manuals,  so  that  learning  about 
a  new  machine  is  easier;  etc.  But  there  are  other  tasks  where  the 

'We have not been influenced in a direct way by the work of Iverson
[Falkoff, Iverson, and Sussenguth, 1964] in the sense of patterning our
notation after his. Nevertheless, his creation of a full description of the
IBM System/360 in APL stands as an important milestone in moving
toward formal descriptions of machines.




notations  must  become  formal  programming  languages,  so  that 
analysis  and  synthesis  procedures  can  be  carried  on  automatically 
in  their  terms.  As  we  have  noted,  the  development  of  ISP  and  PMS 
germinated  from  purely  notational  issues.  We  have  not  let  our 
aspirations  to  turn  them  into  simulation  languages  delay  our  use 
of  them  for  purely  descriptive  purposes.  Thus  we  accept  the  obli- 
gation also  to  develop  them  as  operational  tools.  That  is  also  an 
aspiration  and  cannot  be  dealt  with  anywhere  within  this  book. 

Plan  of  the  book 

We  now  have  enough  background  to  explain  the  structure  of  the 
book.  Two  other  chapters  complete  the  introductory  part.  Chapter 
2  provides  an  exposition  of  the  PMS  and  ISP  descriptive  systems. 
As  we  have  just  noted,  this  does  not  attempt  to  explore  seriously 
the  view  of  digital  processing  implicit  in  these  notations,  although 
it  does  provide  a  small  amount  of  motivation.  A  summary  of  the 
language  conventions  and  parameter  values  is  given  at  the  end 
of  the  book  in  the  appendix. 

Chapter 3 provides a description of the space of computer
systems.  One  can  view  all  computer  systems  as  occupying  a  space 
whose  dimensions  are  the  various  important  systems  features. 
Many  features  of  the  actual  systems  are  relatively  locked  together. 
For  example,  word  size  and  number  of  instructions  in  the  reper- 
toire covary;  no  12-bit  machine  has  200  instructions  but  several 
with  over  32  bits  do.  Thus  the  number  of  significant  dimensions 
of  variation  is  much  less  than  the  total  number  of  features  of 
computer  systems.  Such  a  space  provides  a  basic  frame  in  which 
to  choose  representative  computer  systems  for  inclusion  in  the 
book.  We  hope  Chap.  3  will  also  justify  our  feeling  that  there  is 
a  diversity  and  proliferation  of  computer  systems  that  is  worthy 
of  serious  study. 

The  remainder  of  the  book  is  divided  into  five  parts  (2  to  6, 
with  the  introduction  constituting  Part  1),  and  each  part  into 
sections.  Each  chapter  gives  a  description  of  a  computer  system 


that  is  an  instance  of  the  part  and  section.  Usually  a  chapter 
describes  only  one  computer  or  computer  system,  although  there 
are  a  few  exceptions  in  Part  6  on  computer  families. 

A word needs to be said about the "Virtual" Table of Contents.
Many  of  the  example  computers  are  relevant  to  more  than  one 
part  and  section.  Physically,  they  have  to  be  located  at  one  place. 
But we have permitted multiple entries in the Contents, so that,
for instance, Chap. 33 on the IBM 1800 appears in Sec. 1 of
Part  2  as  an  example  of  a  one-address  ISP,  in  Sec.  1  of  Part  4  as 
a  terminal  control,  and  finally  in  Sec.  2  of  Part  5  as  an  example 
of  a  PMS  with  one  central  processor  and  multiple  input/output 
processors  (1  Pc,  multi-Pio);  physically  it  is  located  in  the  latter 
section. By using different type faces we hope the reader will not
become  confused  between  virtual  and  actual. 

There  is  little  point  in  outlining  the  content  of  the  various  parts 
and  sections  here.  This  is  better  done  at  the  end  of  Chap.  3  after 
the  computer  space  has  been  laid  out. 

References 

Brackets are used to enclose author(s) and year of publication, e.g.,
[Darringer, 1969] or [Falkoff, Iverson, and Sussenguth, 1964]. A list of all
the references in a chapter is given in code at the end of the chapter. The
code refers to the bibliography at the end of the book. This 7- or
8-character code is as follows:

Characters 1:4    First four characters of the last name of author (or
                  first author)
Character 5       First initial of author (or first author)
Characters 6:7    Year of publication - 1900
Character 8       (Optional) a, b, c, ... used to denote multiple
                  referenced publications of author in a year.
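
The coding rule above is mechanical enough to sketch in a few lines. The Python function below is only an illustration: its name and parameters are assumptions of this sketch, and it takes no position on last names shorter than four characters, which the rule leaves unspecified.

```python
def reference_code(last_name: str, first_initial: str, year: int,
                   suffix: str = "") -> str:
    """Build the 7- or 8-character bibliography code (hypothetical helper).

    suffix is the optional eighth character (a, b, c, ...) that separates
    several publications by the same author in one year.
    """
    if not 1900 <= year <= 1999:
        # The scheme stores (year - 1900) in exactly two characters.
        raise ValueError("year must lie between 1900 and 1999")
    name_part = last_name[:4]               # characters 1:4
    initial = first_initial[:1].upper()     # character 5
    year_part = f"{year - 1900:02d}"        # characters 6:7
    return f"{name_part}{initial}{year_part}{suffix}"


print(reference_code("Darringer", "J", 1969))
print(reference_code("Falkoff", "A", 1964))
```

Under this reading, the two citations used as examples above produce codes of the same shape as those listed for this chapter.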

References 

DarrJ69; FalkA64; HaneF68; RoseS67; SteeT61; ZadeL63.


Chapter  2 

The  PMS  and  ISP  descriptive  systems 


The  task  of  this  chapter  is  to  provide  an  introduction  to  the  PMS 
descriptive  system  for  the  top  computer-system  level  and  to  the 
ISP  descriptive  system  for  the  program  level.  We  take  the  view 
that  informal  notations  exist  and  are  in  use.  PMS  and  ISP  are  an 
attempt  to  tidy  up  these  notations — to  make  them  consistent  and 
more  powerful.  Thus  we  depend  on  the  reader  already  to  under- 
stand implicitly  much  of  the  notation  and  how  it  is  to  be  used. 
In  consequence,  there  is  no  attempt  in  this  chapter  to  provide 
a formal treatment of the whole system. The appendix at the
end of the book contains a complete summary of the notation
rules, including the component attributes and values, and their
abbreviations (i.e., the main technical vocabulary). We will provide
a brief discussion of the conceptual view underlying the two
systems, since it is an appropriate way to make the notation
understandable. But this is informal and heuristic.

The  two  descriptive  systems  are  not  independent.  There  is  a 
common  set  of  notational  conventions  for  abbreviating,  for  giving 
parameter values, and so on. (The Appendix separates them.)
Likewise,  there  exists,  in  effect,  an  ISP  description  for  every  PMS 
component,  or,  conversely,  ISP  statements  imply  particular  PMS 
component  structures.  A  natural  way  is  to  present  PMS  first,  which 
will  also  serve  to  introduce  the  main  notational  devices.  Then  we 
will  give  ISP.  Finally,  we  will  add  more  comments  on  the  rela- 
tionship between  PMS  and  ISP. 

PMS  level  of  description 

Digital  systems  can  be  characterized  most  generally  as  systems 
that  at  any  time  exist  in  one  of  a  discrete  set  of  states  and  that 
undergo  discrete  changes  of  state  with  time.  This  is  a  highly  ab- 
stract view.  Nothing  is  said  about  what  physical  state  corresponds 
to  a  system  state;  nothing  is  said  about  what  laws  of  physics  trans- 
form the  system  from  one  state  to  another.  The  states  are  given 

abstract labels: $S_1, S_2, \ldots$. The transitions are provided by a
state-transition table with many entries of the form: If the system
is in state $S_i$ and the input is $I_j$, then the system is transformed
to state $S_k$ and evokes output $O_l$. (Alternatively, a state diagram
has the same information.) The virtue of this "state-system"
view is that it truly seems to capture what we mean by a discrete
(or digital) system. Its disadvantage lies in this same comprehensiveness,
which makes it impossible to deal with large


systems because of their immense number of states (of the order
of $10^{10^{10}}$ states for a big computer).'
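
A state-transition table of the kind just described can be sketched as a lookup from (state, input) to (next state, output). The two states, two inputs, and the particular entries below are invented purely for illustration; only the shape of the table comes from the text.

```python
# Hypothetical two-state system. Each entry reads:
# (current state, input) -> (next state, output evoked)
TRANSITIONS = {
    ("S1", "I1"): ("S2", "O1"),
    ("S1", "I2"): ("S1", "O2"),
    ("S2", "I1"): ("S1", "O2"),
    ("S2", "I2"): ("S2", "O2"),
}


def run(state, inputs):
    """Apply the table to a sequence of inputs; return final state and outputs."""
    outputs = []
    for symbol in inputs:
        state, out = TRANSITIONS[(state, symbol)]
        outputs.append(out)
    return state, outputs


final_state, outputs = run("S1", ["I1", "I2", "I1"])
print(final_state, outputs)
```

A table like this needs one entry for every pairing of a state with an input, which is why the approach becomes unworkable once the number of states is astronomically large.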

Existing  digital  computers  can  be  viewed  as  discrete  state 
systems that are specialized in three ways. These three specializations
make possible a much more compact and useful description
of  these  systems,  the  one  that  we  call  the  PMS  description. 

First,  the  state  is  realized  by  a  medium,  called  information, 
which  is  stored  in  memories.  Thus,  a  core  store  of  N  words  each 
of 32 bits is a digital device that can exist in one of $2^{32N}$ states.
Similarly, all the states of a processor are made explicit in a set
of registers: an accumulator, an address register, an instruction
register, status register, etc. Each holds a specified number of bits.
No permanent information is kept in digital devices except as
encoded in bits in a memory. There are two qualifications to this
blanket statement. First, the basic unit of information need not
be the bit; it could be any base: One can have ternary machines,
decimal machines, etc. Second, the sequential logic circuits that
carry out operations in the system have intermediate states. But
this is a strictly temporary affair while the operation is occurring,
for example, the intermediate, inaccessible, partial results during
a multiply operation. At the end, when the smoke has cleared,
so to speak, all information carried over to the next operation
has been encoded into bits in memories somewhere. At the PMS
level we care only about the end result of such operations.
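
The count of memory states follows directly from the number of bits: N words of 32 bits hold 32N bits, and each added bit doubles the number of states. The small Python sketch below, whose function names are assumptions of this example, shows how quickly that count grows.

```python
import math


def memory_states(words: int, bits_per_word: int = 32) -> int:
    """Number of distinct states of a memory: 2 ** (total number of bits)."""
    return 2 ** (words * bits_per_word)


def decimal_digits(words: int, bits_per_word: int = 32) -> int:
    """Decimal digits in the state count, computed without building the number."""
    total_bits = words * bits_per_word
    return math.floor(total_bits * math.log10(2)) + 1


print(memory_states(1))        # states of a single 32-bit word
print(decimal_digits(32768))   # digits in the state count of a 32768-word store
```

Even a single word has over four thousand million states, and the digit count for a store of realistic size gives a feel for why the raw state-system view cannot be used directly.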

The  second  specialization  of  the  general  state-system  view  is 
that  current  digital  computer  systems  consist  of  a  small  number 
of  discrete  subsystems  linked  together  by  flows  of  information. 
There  is  a  distinct  component  called  the  memory,  another  called 
the  central  processor,  another  called  the  card  reader,  etc.  This 
is  analogous  to  the  lumped-parameter  specialization  at  the  circuit 
level.  Thus  the  natural  representation  of  a  digital  computer  system 
is  as  a  graph  which  has  component  systems  at  the  nodes  and 
information  flows  as  branches.  Now,  in  fact,  the  discrete  character 
of  digital  encoding  in  bits  prevents  there  being  any  truly  continu- 
ous digital  devices  (in  analogy  to  the  continuously  distributed 
parameter  circuits).  But  one  can  have  distributed  networks  with 
very  small  components.  Such  iterated  arrays  are  a  topic  of  much 

' As we noted in Fig. 1 of Chap. 1, we actually describe some parts of
the control mechanisms of computers by state-system diagrams; however,
these are exceedingly small pieces. An example may be seen in Fig. 7 on
page 7.




current  investigation,  as  the  possibility  of  manufacturing  them  by 
integrated-circuit  techniques  has  emerged.  These  distributed  net- 
works look  very  different  from  the  computer  systems  of  today, 
although  they  are  still  digital  systems.  Thus,  the  representation 
as  a  flow  network  with  functionally  specialized  nodes  is  a  real 
specialization. 

The third specialization of the general state-system viewpoint
is that associated with each component in a digital system is a
small  number  of  discrete  operations  for  changing  its  own  state  or 
the  state  of  neighboring  components.  All  transitions  must  occur 
through  the  application  of  these  few  operations,  which  are  evoked 
as  a  function  of  the  current  state  of  the  component.  The  total 
behavior  of  the  system  is  built  up  from  the  repeated  execution 
of  the  operations  as  the  conditions  for  their  execution  become 
realized  by  the  results  of  prior  operations.  The  general  state-system 
view  is  more  general.  The  state-transition  table  for  a  system  may 
exhibit  an  arbitrary  pattern  of  immediate  state  transitions,  without 
regard  to  how  such  transition  would  be  physically  realized. 

To  summarize,  within  this  specialized  view  one  wants  a  way 
of  describing  a  system  of  an  interconnected  set  of  components, 
which  are  individual  devices  that  have  associated  with  them  a  set 
of  operations  that  work  on  a  medium  of  information,  measured 
in  bits  (or  some  other  base). 

The  major  complication  in  this  picture  is  the  amount  of  detail 
involved  in  describing  actual  computers.  It  takes  a  whole  manual, 
for  instance,  to  describe  the  operations  of  a  major  computer,  such 
as  the  IBM  7090.  Thus  the  descriptive  system  must  permit  very 
compressed  descriptions.  It  must  also  permit  description  of  only 
those  aspects  of  the  components  that  are  of  interest,  ignoring  the 
rest.  And  what  is  of  interest  at  the  PMS  level?  Besides  a  description 
of  the  gross  structure  of  a  computer  system,  it  is  primarily  the 
analysis  of  the  amounts  of  information  held  in  various  components, 
the  flows  of  information  between  components,  and  the  distribution 
of  the  control  that  accomplishes  these  flows. 

Thus  a  PMS-level  description  is  analogous  to  the  chemical 
engineer's  diagram  of  a  refinery  in  which  he  is  interested  in  various 
kinds  of  liquid  and  gas  flow.  He  has  to  account  for  matter  and 
energy  loss  with  the  system  at  various  stages  involving  the  trans- 
duction of  materials  from  one  form  to  another.  A  specific  chemical 
plant's  external  performance  is  measured  in  terms  of  its  production 
flow  rate  for  a  given  cost.  With  computers,  external  performance 
is  concerned  with  the  economical  accomplishment  of  discrete 
tasks,  but  at  the  PMS  level  this  translates  into  operation  rates  and 
cost  of  operations. 

For  the  PMS  level  we  ignore  all  the  fine  structure  of  informa- 
tion processing  and  consider  a  system  consisting  of  components 


that  work  on  a  homogeneous  medium  called  information.  Infor- 
mation comes  in  packets,  called  i-units  (for  information  units),  and 
is  measured  in  bits  (or  equivalent  units,  such  as  characters).  I-units 
have  the  sort  of  hierarchical  structure  indicated  by  the  phrase:  A 
record  consists  of  300  words;  a  word  consists  of  4  bytes;  a  byte 
consists of 8 bits. A record, then, contains 300 × 4 × 8 =
9,600 bits. Each of these numbers (300, 4, 8) is called a length,
since  one  often  thinks  of  an  i-unit  as  a  spatial  sequence  of 
the  next  lower  i-units  of  which  it  is  composed.  For  example, 
one speaks of "word length" and of a record being "300 words
long." 
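
The arithmetic of nested i-units can be sketched in Python as follows. The lengths are those of the example in the text; the list layout and the function name are assumptions of this sketch.

```python
# Each entry: (i-unit name, length measured in the next-lower i-unit).
# The last entry, the byte, is measured directly in bits.
HIERARCHY = [("record", 300), ("word", 4), ("byte", 8)]


def bits_in(unit: str) -> int:
    """Multiply the lengths from the named i-unit down to the bit level."""
    names = [name for name, _ in HIERARCHY]
    start = names.index(unit)  # raises ValueError for an unknown unit
    total = 1
    for _, length in HIERARCHY[start:]:
        total *= length
    return total


print(bits_in("record"))  # 300 * 4 * 8
print(bits_in("word"))    # 4 * 8
```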

Other  than  being  decomposable  into  a  hierarchy  of  factors, 
i-units  have  no  other  structure  at  the  PMS  level.  They  do  have 
a  referent,  that  is,  a  meaning.  Thus  it  is  possible  to  say  of  an 
i-unit  that  it  refers  to  an  employer's  payroll,  to  the  pressure  of 
a  boiler,  or  to  a  prime  number  satisfying  certain  conditions.  To 
do  so,  of  course,  the  i-units  encode  the  information  necessary  to 
make  the  reference.  At  the  PMS  level  we  are  not  concerned  with 
what  is  referred  to,  but  only  with  the  fact  that  certain  components 
transform  i-units  but  do  not  modify  their  meaning.  In  fact,  these 
meaning-preserving operations are the most basic information-processing
operations of all, and they provide the basic classification
of computer components.

PMS  primitives 

In  PMS  there  are  seven  basic  component  types,  each  distinguished 
by  the  kinds  of  operations  it  performs: 

Memory,  M.  A  component  that  holds  or  stores  information 
(i.e.,  i-units)  over  time.  Its  operations  are  reading  i-units  out 
of  the  memory  and  writing  i-units  into  the  memory.  Each 
memory  that  holds  more  than  a  single  i-unit  has  associated  with 
it  an  addressing  system  by  means  of  which  particular  i-units 
can be designated or selected. A memory can also be considered
as a switch to a number of submemories. The i-units are
not  changed  in  any  way  by  being  stored  in  a  memory. 

Link,  L.  A  component  that  transfers  information  (i.e.,  i-units) 
from  one  place  to  another  in  a  computer  system.  It  has  fixed 
ports.  The  operation  is  that  of  transmitting  an  i-unit  (or  a 
sequence  of  them)  from  the  component  at  one  port  to  the 
component  at  the  other.  Again,  except  for  the  change  in  spatial 
position,  there  is  no  change  of  any  sort  in  the  i-units. 

Control,  K.  A  component  that  evokes  the  operations  of  other 
components  in  the  system.  All  other  components  are  taken  to 
consist  of  a  set  of  discrete  operations,  each  of  which,  when 
evoked,  accomplishes  some  discrete  transformation  of  state. 




With  the  exception  of  a  processor,  P,  all  other  components  are 
essentially  passive  and  require  some  other  active  agent  (a  K) 
to  set  them  into  small  episodes  of  activity. 

Switch,  S.  A  component  that  constructs  a  link  between  other 
components.  Each  switch  has  associated  with  it  a  set  of  possible 
links,  and  its  operations  consist  of  setting  some  of  these  links 
and  breaking  others. 

Transducer, T. A component that changes the i-unit used to
encode  a  given  meaning  (i.e.,  a  given  referent).  The  change  may 
involve  the  medium  used  to  encode  the  basic  bits  (e.g.,  voltage 
levels  to  magnetic  flux,  or  voltage  levels  to  holes  in  a  paper 
card),  or  it  may  involve  the  structure  of  the  i-unit  (e.g.,  bit-serial 
to  bit-parallel).  Note  that  T's  are  meaning-preserving  but  not 
necessarily information-preserving (in number of bits), since the
encodings of the (invariant) meaning need not be equally optimal.

Data-operation,  D.  A  component  that  produces  i-units  with 
new  meanings.  It  is  this  component  that  accomplishes  all  the 
data-operations,  e.g.,  arithmetic,  logic,  shifting,  etc. 

Processor,  P.  A  component  that  is  capable  of  interpreting  a 
program  in  order  to  execute  a  sequence  of  operations.  It  consists 
of a set of operations of the types already mentioned (M, L,
K, S, T, and D) plus the control necessary to obtain instructions
from a memory and interpret them as operations to be
carried  out. 
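
The seven primitives can be collected into a small lookup table. The Python sketch below is one possible encoding rather than part of the PMS notation: the one-line summaries paraphrase the definitions above, and the ACTIVE set reflects our reading that only K and P initiate activity.

```python
from enum import Enum


class Component(Enum):
    M = "memory"
    L = "link"
    K = "control"
    S = "switch"
    T = "transducer"
    D = "data-operation"
    P = "processor"


# Paraphrased summaries of what each component type does.
OPERATIONS = {
    Component.M: "holds i-units over time; reads and writes them unchanged",
    Component.L: "transmits i-units between fixed ports, unchanged",
    Component.K: "evokes the operations of other components",
    Component.S: "sets and breaks links between other components",
    Component.T: "changes the encoding of an i-unit, preserving its meaning",
    Component.D: "produces i-units with new meanings",
    Component.P: "interprets a program to execute a sequence of operations",
}

# Assumption of this sketch: components able to set others into activity.
ACTIVE = {Component.K, Component.P}


def describe(symbol: str) -> str:
    """Return a one-line description for a PMS letter such as 'T'."""
    component = Component[symbol]
    role = "active" if component in ACTIVE else "passive"
    return f"{symbol} ({component.value}, {role}): {OPERATIONS[component]}"


for member in Component:
    print(describe(member.name))
```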


Throughout  PMS  (and  ISP,  too)  an  operation  is  taken  to  mean 
a  transformation  of  bits  from  one  specific  memory  to  another.  For 
instance,  it  is  an  operation  to  transmit  a  word  of  information  from 
memory  M  to  memory  M';  it  is  a  different  operation  to  transmit 
a  word  from  memory  M'  to  M".  Similarly,  it  is  an  operation  to 
add  the  contents  of  memory  M  to  that  of  M'  and  a  different 
operation  to  add  the  contents  of  M'  to  M". 

The  reason  for  emphasizing  this  point  is  that  one  often  talks 
as  if  addition  were  an  operation,  ignoring  the  specific  locus  of  the 
operands. In a discussion of computer systems, an operation must
include  specification  of  the  locus  of  its  operands.  The  reason  is 
that  the  physical  devices  that  realize  operations  are  always  local- 
ized in  space.  If,  for  instance,  we  wish  to  have  a  physical  device 
that  corresponds  to  addition  on  operands  an\~where  in  some  mem- 
ory, we  must  couple  the  physical  device  that  adds  with  other 
devices  that  either  transmit  information  to  and  from  the  memory 
to  the  adder  or  (more  exotic)  that  modify  the  adder  to  have  differ- 
ent cells  of  memory  as  its  terminals.  Thus  the  symbol  -I-  is  to  be 
taken  as  an  incomplete  specification  of  an  operation. 
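
The point about the locus of operands can be made concrete with a small sketch. The Python below is an illustration only and is not part of PMS or ISP: the class name `Operation` and its field names are assumptions introduced here. It treats an operation as a function together with its source and destination memories, so that two additions with different loci are different operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """An operation in the sense of the text: a function plus the locus of its operands."""
    function: str      # e.g. "add" or "transmit"
    source: str        # memory the operand is taken from
    destination: str   # memory that receives the result


# Hypothetical memory names, following the M, M', M'' of the text.
add_m_to_m1 = Operation("add", "M", "M'")
add_m1_to_m2 = Operation("add", "M'", "M''")

# The function name alone ("add") is an incomplete specification:
same_function = add_m_to_m1.function == add_m1_to_m2.function
same_operation = add_m_to_m1 == add_m1_to_m2
print(same_function, same_operation)
```

The two objects share the function `add` but compare as unequal, which mirrors the statement that the symbol + alone does not specify an operation.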


Computer model (in PMS)

Components of the seven types can be connected to make stored-program
digital computers, abbreviated by C. For instance, the
classical configuration for a computer is

C := Mp-Pc-T-X

Here Pc indicates a central processor and Mp a primary memory,
namely, one which is directly accessible from a P and holds the
program for it. T is a transducer connected to the external environment,
represented by X. (The colon-equals (:=) indicates that C
is the name of what follows to the right.) Thus a computer is
a central processor connected to its primary memory on the one
hand and to a transducer on the other, which is what an input/output
device is.
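
As a rough illustration of how such a structure could be held in a program, the sketch below builds the classical configuration as a small graph. Everything in it (the `Component` class, the `connect` and `chain` helpers, and the use of a plain hyphen for a link) is an assumption made for this example and is not part of the PMS notation itself.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Component:
    """A PMS component: a type letter, an optional qualifier, and its neighbors."""
    kind: str                      # "M", "P", "T", ... (X stands for the environment)
    qualifier: str = ""            # e.g. "p" for primary, "c" for central
    neighbors: list = field(default_factory=list, repr=False)

    @property
    def name(self):
        return self.kind + self.qualifier


def connect(a, b):
    """Record an (unnamed) link between two components."""
    a.neighbors.append(b)
    b.neighbors.append(a)


def chain(*components):
    """Connect the components in a row and return the one-line PMS form."""
    for left, right in zip(components, components[1:]):
        connect(left, right)
    return "-".join(c.name for c in components)


mp, pc, t, x = Component("M", "p"), Component("P", "c"), Component("T"), Component("X")
classical = chain(mp, pc, t, x)
print(classical)
```

In this sketch the links are left implicit, as in the text, and only the connected components are recorded.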

Actually the classic diagram had four components, since it
decomposed the Pc into a control (K) and an arithmetic unit or
data-operation (D):

Mp - K - T|Ms - X
     |
     D

Mp - D - T|Ms - X


where  the  solid  information-carrying  lines  are  for  instructions  and 
their  data,  and  the  dotted  lines  signify  control. 

Often  logic  operations  were  lumped  with  control,  instead  of 
with  data  operations,  but  this  no  longer  seems  to  be  the  appro- 
priate way  to  decompose  the  system  functionally. 

If we associate local control of each component with the appropriate
component, we get

[Diagram defining Pc. Labels: Mp; data; instructions; K(Mp);
M(processor state); K(D); K(T).]


where the solid lines carry the information in which we are interested,
and the dotted lines carry information about when to evoke
operations on the respective components. The solid information-carrying
lines between K and Mp are instructions. Now, suppressing
the K's, then lumping the processor state memory, the data
operators, and the control of the data-operations and processor
state memory to form a central processor, we again get

Mp-Pc-T-X

¹ The "|" expresses mutually exclusive alternatives. Here, a T or Ms exists
at the periphery.


Part 1 | The structure of computers

Computer  systems  can  be  described  in  PMS  at  varying  levels 
of  detail.  For  instance,  in  the  diagrams  above  we  did  not  write 
in  the  links  (L's)  as  separate  components.  These  would  be  of  inter- 
est only  if  the  delays  in  transmission  were  significant  to  the  dis- 
cussion at  hand  or  if  the  i-units  transmitted  by  the  L  were  different 
from  those  available  at  its  terminals.  Since  this  is  not  usually  the 
case  in  current  computers,  one  indicates  simply  that  two  com- 
ponents (e.g.,  an  Mp  and  a  Pc)  are  connected  together.  Similarly, 
often  the  encoding  of  information  into  i-units  is  unimportant;  then 
there  is  no  reason  to  show  the  T's.  The  same  statement  holds  for 
K's.  Sometimes  one  wants  to  show  the  locus  of  control,  say  when 
there  is  one  control  for  many  components,  as  in  a  tape  controller, 
but  often  this  is  not  of  interest.  Then  there  is  no  reason  to  show 
K's  in  a  PMS  diagram. 

As  a  somewhat  different  case,  D's  never  occur  in  PMS  diagrams 
of  computers,  since  in  the  present  design  technology  D's  occur 
only  as  subcomponents  of  P's.  If  we  were  to  make  PMS-type 
diagrams  of  analog  computers,  D's  would  show  extensively  as 
multipliers,  summers,  integrators,  etc.  There  would  be  few  mem- 
ories and  variable  switches.  The  rather  large  patchboard  would 
be  represented  as  a  very  elaborate  manually  fixed  switch. 

Components  are  often  decomposable  into  arrangements  of 
other  components.  Thus,  most  memories  are  composed  of  a 
switch — the  addressing  switch — and  a  number  of  submemories. 
Thus  a  memory  is  recursively  defined.  The  decomposition  stops 
with  the  unit  memory,  which  is  one  that  stores  only  a  single  i-unit 
and  hence  requires  no  addressing.  Likewise,  a  switch  is  often 
composed  of  a  cascade  of  one-way  to  n-way  switches.  For  example, 
the  switch  that  addresses  a  word  on  a  multiple-headed  disk  might 
look  like 

-S(random)-S(random)-S(linear)-S(cyclic)-M(word)

The first S(random) selects a specific Ms.disk_drive_unit; the second
S(random) is a switch with random addressing that selects the
head (hence the platter and side); S(linear) is a switch with linear
accessing that selects the track; and S(cyclic) is a switch with
cyclic addressing that finally selects the M(word) along the circular
track. Note that the switches are realized by differing technologies.
The first two S(random)'s are generally electronic (AND-OR gates)
with selection times of 10 ~ 100 microseconds or perhaps electromechanical
(relay). The S(linear) is the electromechanical action
of a stepping motor or a pneumatic-driven, servomechanism-controlled
arm which holds the read-write heads; the selection
time for a new track is 50 ~ 500 milliseconds. Finally, the S(cyclic)
is determined by the rotation time of the disk and requires from
16 ~ 60 milliseconds, depending on the speed (3,600 ~ 1,000
rpm).
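
The timing figures quoted for the cascade can be checked with a little arithmetic. The sketch below is illustrative only: the function and variable names are invented here, and it assumes that the four switches act strictly one after another, which the text does not state. It derives the rotation time from the disk speed and adds up the upper ends of the quoted ranges.

```python
def rotation_time_ms(rpm):
    """Time for one full revolution of the disk, in milliseconds."""
    return 60_000.0 / rpm


# Worst-case selection time of each stage, in milliseconds, taken from the
# upper ends of the ranges quoted in the text.
stages_ms = [
    ("S(random), drive", 0.1),                      # 100 microseconds
    ("S(random), head", 0.1),                       # 100 microseconds
    ("S(linear), track", 500.0),
    ("S(cyclic), word", rotation_time_ms(1000)),    # slowest disk quoted
]

# Assumption: the stages are strictly sequential, so their times add.
worst_case_ms = sum(time for _, time in stages_ms)

print(round(rotation_time_ms(3600), 1), rotation_time_ms(1000), round(worst_case_ms, 1))
```

The two rotation times reproduce the 16 ~ 60 millisecond range given for S(cyclic), and the total shows that the mechanical stages dominate the electronic ones by several orders of magnitude.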

We can write such decompositions of a component into subcomponents
either when we actually know the structure of the
component or even when we know only the behavior. For example,
we could write a memory as random access (M.random) even if
it was, in fact, cyclic, as long as its behavior as far as the larger
system was concerned took no account of its cyclic character,
accepting the average access time as the random-access time.

When people speak of the control element of a computer, they
often refer mainly to the processors, not to the control of a disk
or magnetic tape, which, however, can often be more complex.
When we suppress detail, the control often disappears from a PMS
diagram. Similarly, when we agglomerate primitive components
(as we did above when combining Mp and K(Mp) to be just Mp)
into the physically distinct subparts of a computer system, a separate
control, K, often occurs. The functionally and physically
separate control¹ has evolved in the past decade. These controls,
often as big as a Pc, can be computers with stored control programs.
When we decompose a compound control, we find data-operations
(D) for calculating addresses or for error detection and
error correction data; transducers (T) for changing logic signal
levels and information flow widths; memory (M) as it is used in
D, T, K, and for buffering; and finally a large control (K) which
coordinates the activities of all the other primitives.

It  should  be  clear  from  the  above  discussion  that  components 
are  named  according  to  the  function  they  perform  and  that  they 
can  be  composed  of  many  different  types  of  components.  Thus, 
a  control  (K)  may  have  memory  (M)  as  a  subcomponent,  and  a 
memory  M  may  have  a  transducer  (T)  as  well  as  a  switch  (S)  as 
subcomponents.  All  these  subcomponents  exist  to  accomplish  the 
total  function  of  the  component  and  do  not  make  the  component 
also some other type. For instance, the M that does a transduction
(T) from voltages on its input wires to magnetism in its cores and
a second transduction from magnetism to voltages on its output
wires does not thereby become a transducer as far as the total
system functioning is concerned. To the rest of the system all the
M can do is to remember i-units, accepting and delivering them
in the same form (voltages). In the Appendix at the end of this
book we define for each type both a simple component and a
compound component, reflecting in part this fact that complex
subsystems can be put together to perform a single function from
the viewpoint of the total system. For example, a typewriter may
have 4 ~ 6 simple information transduction channels.

¹ A variety of names for K's are used: controller, adapter, channel, buffer,
interface, etc.


Chapter 2 | The PMS and ISP descriptive systems 19

PMS  notation 

In the above discussions we used various notations to designate
additional specifications for a component, for example, Mp for a
functional classification, and S(cyclic) for a type of access function.
There are many other additional specifications one wants to give,
so many that it makes no sense to enumerate them all in advance.
A fixed position notation, such as standard function notation,
F(x,y,z), where the first, second, and third argument places have
fixed interpretation, is not suitable. Instead we agree on a single
general way of providing additional specifications. If X is a component,
we can write

X(a1:v1; a2:v2; ...)

to indicate that X is further specified by attribute a1 having value
v1, attribute a2 having value v2, etc. Each parameter (as we call
the pair a:v) is well defined independently of whatever other
parameters are given; hence there is no significance to the order
in which they are written or the number which have to be written.
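
The order-independence of parameters can be illustrated with a short parser. This is a minimal sketch rather than a full reader for PMS: the function name `parse_component` is invented here, and it assumes a flat parameter list in which no value itself contains a semicolon or nested parentheses.

```python
def parse_component(text):
    """Split 'X(a1:v1; a2:v2)' into the component name and a dict of parameters."""
    name, _, rest = text.partition("(")
    body = rest.rstrip()
    if body.endswith(")"):
        body = body[:-1]
    params = {}
    for item in body.split(";"):
        if not item.strip():
            continue
        # Everything before the first colon is the attribute; the rest is the value.
        attribute, _, value = item.partition(":")
        params[attribute.strip()] = value.strip()
    return name.strip(), params


first = parse_component("M(function:primary; technology:core)")
second = parse_component("M(technology:core; function:primary)")
print(first == second)
```

Because the parameters are stored by attribute name rather than by position, the two spellings denote the same specification, which is the property the general convention is meant to secure.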

According to this notation we should have written M(function:primary)
or S(access-function:random) rather than Mp or S(random).
This shows immediately the price paid for the general
convention: it requires an excessive amount of writing (which
would be even more apparent if a large number of parameters
were given), and the extra information seems to be redundant in
some cases. We compensate for these disadvantages by several
conventions for abbreviating and abstracting parameters. All these
conventions are listed in the Appendix. Let us illustrate them by
showing some alternative ways of writing Mp:

M(function:primary)   Complete specification.

M(primary)            Drop the attribute "function," since it can be
                      inferred from the value.

M.primary             Use the value outside the parentheses,
                      concatenated with a dot.

M.p                   Use an explicitly given abbreviation, namely,
                      primary/p (only if it is not ambiguous).

Mp                    Drop the concatenation marker (the dot), if it
                      is not needed to recover the two parts (all
                      components are given by a single capital
                      letter; here M).

Each of these rules corresponds to a natural tendency to abbreviate
when redundant information is given; each has as its condition
that recovery must be possible.
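
The recovery condition can be shown by running the abbreviation rules backwards. The sketch below is hypothetical: the two lookup tables stand in for the definitions in the Appendix and contain only the entries needed here, and the function name `expand` is an assumption of this example.

```python
# Assumed stand-ins for the Appendix definitions (value/abbreviation pairs and
# the attribute each value belongs to); only memory "function" is covered.
ABBREVIATIONS = {"p": "primary", "s": "secondary"}
ATTRIBUTE_OF = {"primary": "function", "secondary": "function"}


def expand(text):
    """Recover the complete specification from any of the abbreviated forms."""
    letter, rest = text[0], text[1:]      # components are a single capital letter
    if rest.startswith("(") and rest.endswith(")"):
        value = rest[1:-1]
        if ":" in value:                  # already complete
            return text
    elif rest.startswith("."):            # value concatenated with a dot
        value = rest[1:]
    else:                                 # dot dropped as well
        value = rest
    value = ABBREVIATIONS.get(value, value)
    return f"{letter}({ATTRIBUTE_OF[value]}:{value})"


forms = ["M(function:primary)", "M(primary)", "M.primary", "M.p", "Mp"]
expanded = [expand(form) for form in forms]
print(expanded)
```

Each of the five spellings expands to the same complete specification, which is what it means for every abbreviation step to be recoverable.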

In the full description in the Appendix each component is
defined and given a large number of parameters, i.e., attributes
with their domain of values. Throughout, we use the slash (/) to
introduce abbreviations or aliases as we go.¹ Thus p is introduced
as an abbreviation for "primary" by writing primary/p when
"primary" is given as one of the values of the attribute "function"
of a memory with respect to processors (see page 607). The list
of parameters in the Appendix does not exhaust those aspects of
a component that one might want to talk about. For instance, there
are many distinct dimensions for any component in addition to
the information dimension: packaging, physical size, physical location,
energy use, cost, weight, style and color, reliability, maintainability,
etc. Furthermore, each of these dimensions includes
an entire set of parameters, just as the information dimension
breaks out into the set of parameters we have given in the Appendix.
Thus the descriptive system is an open one, and new parameters
are definable at any occasion.

The very large number of parameters provides one of the major
challenges to creating a viable scheme to describe computer systems.
We have responded to this in part by providing automatic
ways in which one can compress the descriptions by appropriate
abbreviation while still avoiding a highly cryptic encoding of each
separate aspect. Abstraction is another major area in which some
conventions can help to handle the large numbers of parameters.
It often happens that one has only imperfect information about
an attribute, or one wishes to give its value only approximately
or partially. For instance, one attribute of a processor is the time
taken by its operations. This attribute can be defined with a complex
value:

Pc(operation-times: add: 4 µs, store: 4 µs, load: 4 µs,
multiply: 16 µs, ...)

That is, the value is a list of times for each separate operation.
However, one might wish to give only the range of these numbers;
this is done without introducing a new attribute (i.e., operation-time-range)
simply by indicating that the value is a range:

Pc(operation-time: 4 ~ 16 µs)

Similarly, one could have given typical times or average times
(under some assumed frequency mix of instructions):

Pc(operation-time: 4 µs)
Pc(operation-time: average: 8.1 µs)

The primary advantage of this notational convention, which permits
descriptions of values to be used in place of actual values
whenever desired, is that it keeps the number of attributes that
have to be defined much smaller than otherwise.

¹ There is no difficulty in distinguishing this use from the use of the slash
as a division sign: the latter takes priority, since it is the more specific
use of the slash.
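
The relation between a full list of values and its abstractions can be sketched as follows. The four operation times are those given in the text; the instruction mix is a made-up example, so the resulting average is not expected to match the 8.1 µs quoted above, which presumably rests on a fuller operation list and a different mix.

```python
# Operation times in microseconds, as listed in the text.
operation_times_us = {"add": 4, "store": 4, "load": 4, "multiply": 16}

# Hypothetical frequency mix in percent (an assumption for illustration only).
mix_percent = {"add": 30, "store": 20, "load": 40, "multiply": 10}

# Abstraction 1: give only the range of the values.
time_range_us = (min(operation_times_us.values()), max(operation_times_us.values()))

# Abstraction 2: give the average under the assumed mix.
weighted_sum = sum(operation_times_us[op] * mix_percent[op] for op in operation_times_us)
average_us = weighted_sum / sum(mix_percent.values())

print(time_range_us, average_us)
```

Both abstractions are values of the same attribute, operation-time, which is why no separate attribute such as operation-time-range has to be defined.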

A  PMS  example  using  the  DEC  PDP-8 

Let  us  now  describe  the  PMS  structure  of  an  actual,  though 
small,  general-purpose  computer,  the  DEC  LINC-8,  which  is  a 
PDP-8  with  a  LINC  processor.  Figure  1  gives  the  detailed  PMS 
diagram.  In  explaining  it,  we  will  concentrate  on  making  the 
notation  clear  rather  than  on  discussing  substantive  features  of  the 
system  (which  are  described  in  Chap.  5).  A  simplified  PMS  diagram 
of the system shows its essential structure:

[Simplified PMS diagram. Labels: Mp; Pc; two S's; T; T.console; Ms;
P.display; Pc('LINC) with its own Ms.]


This  shows  the  basic  Mp-Pc-T-X  structure  of  a  C  with  the  addition 
of  a  secondary  memory  (Ms)  and  two  processors,  one  of  which, 
Pc('LINC),  has  its  own  Ms.  Two  switches  are  used:  the  I/O  Bus 
which  permits  access  to  all  the  devices,  and  the  Data  Break  to 
Mp  via  Pc  for  high-data-rate  devices.  There  are  many  other 
switches  in  the  actual  system,  as  one  can  see  from  Fig.  1;  for 
example,  Mp  is  really  one  to  eight  separate  modules  connected 
by  a  switch  S  to  Pc.  Also  there  are  many  T's  connected  to  the 
input/output  switch,  Sio,  which  we  collapsed  as  a  single  T,  and 
similarly for S('Data Break).

Consider the Mp module. The specifications assert that it is
made with core technology; that its word size is 13 bits (12 data
bits plus one other with a different function); that its size is 4,096
words; and that its operation time is 1.5 µs. We could have written
the same information as

M(function: primary; technology: core; operation-time: 1.5 µs;
size: 4096 w; word: (12 + 1) b)

In  Fig.  1  we  wrote  only  the  values,  suppressing  the  attributes,  since 
moderate  familiarity  with  memories  permits  an  immediate  infer- 
ence about  what  attributes  are  involved.  For  example,  it  is  com- 
mon knowledge  that  computer  memories  store  information  in 
words;  therefore  4096  w  must  be  the  number  of  words  in  the 
memory.  As  another  example,  we  did  not  specify  the  function  of 
the additional bit in the word when we wrote (12 + 1) b. An
informed  reader  will  assume  this  to  be  a  parity  bit,  since  this  is 
the  common  reason  for  having  an  extra  bit  in  a  word.  If  the  extra 
bit  had  some  unusual  function,  we  would  have  needed  to  define 
it.  That  is,  in  the  absence  of  additional  information,  the  most 
common  interpretation  is  to  be  assumed. 

In  fact,  we  could  have  been  even  more  cryptic  and  still  com- 
municated with  most  readers: 

M.core(1.5 µs/w; 4 kw; 12 b)

This corresponds to the phrase "A 12-bit, 1.5-µs, 4k core store,"
which is intelligible to any computer engineer. The 4 kw stands
for 4 × 1,024 = 4,096, which again is known to computer
engineers; however, if someone less informed took it to be 4 ×
1,000 = 4,000, no real harm would be done.
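
The arithmetic behind the abbreviated form is easy to spell out. The snippet below is a worked example only; the variable names are invented here, and the bit counts follow from the 4 kw and (12 + 1) b figures given for this Mp.

```python
K = 1024                       # "k" as computer engineers read it

words = 4 * K                  # 4 kw
data_bits = words * 12         # 12 data bits per word
stored_bits = words * (12 + 1) # including the one extra bit per word

# How far off is the reading 4 kw = 4 x 1,000?
naive_words = 4 * 1000
relative_error = (words - naive_words) / words

print(words, data_bits, stored_bits, relative_error)
```

The naive reading is off by 96 words, a little over 2 percent, which supports the remark that no real harm would be done.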

Consider the magnetic tapes for Pc. Since there are eight
possible tapes that make use of the same controller, K, through
a switch S, we label them #0 through #7. Actually, # is an
abbreviation for index, which is an attribute like any other, whose
values are integers. Since the attribute is a unique character, we
do not have to write #:3 (although we could). The additional
parameters give information about the physical attributes of the
encoding. These are alternative values, and any tape has only one
of them. We use a vertical bar ( | ) to indicate this (as in BNF
notation for grammars). Thus, 75|112 in/s says that one can have
a tape with a speed of 75 inches per second or one with 112 inches
per second, but not a tape which can be switched dynamically
to run at either speed.
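
Both conventions, the index and the vertical bar, lend themselves to a mechanical reading. The helpers below are a sketch under stated assumptions: their names are invented, the range form `#0:7` is assumed to mean indices 0 through 7, and an alternatives value is assumed to consist of integers separated by bars and followed by one space and a unit.

```python
def expand_index(spec):
    """'#0:7' -> [0, 1, ..., 7]; '#3' -> [3]."""
    body = spec.lstrip("#")
    low, _, high = body.partition(":")
    return list(range(int(low), int(high or low) + 1))


def alternatives(spec):
    """'75|112 in/s' -> ([75, 112], 'in/s'): exactly one value applies to a unit."""
    values, _, unit = spec.partition(" ")
    return [int(v) for v in values.split("|")], unit


tapes = expand_index("#0:7")
speeds, unit = alternatives("75|112 in/s")
print(len(tapes), speeds, unit)
```

A description of one particular tape would then pick a single entry from `speeds`, reflecting that the bar lists mutually exclusive alternatives rather than a switchable setting.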

For many of the components no further information is given.
Thus, knowing that M.magnetic_tape is connected to a control
and from there to the Pc tells generally what that K does. It
is a "tape controller" which evokes all the actions of the tape,
such as read, write, rewind; therefore these actions do not have
to be done by Pc. The fact that there is only one K for many Ms's
implies that only one tape can be accessed at a time. Other information
could be given, although that just provided is all that is
usual in specifying a controller in an overall description of a system.
(The next level of detail goes to the structure of the actual
operations and instructions and belongs to the ISP level, not the
PMS level.)

[Figure 1: full PMS diagram. Legible component entries include
Mp(core; 1.5 µs/w; 4096 w; (12 + 1) b); S('Memory Bus);
Pc(1 ~ 2 w/instruction; 12 b/w; technology: transistors;
antecedents: PDP-5; descendants: PDP-8/S, PDP-8/I, PDP-L);
S('I/O Bus; from: Pc; to: 64 K); S('Data Break); T.console;
T(Teletype; 10 char/s; 8 b/char);
T(paper tape; (reader; 300 char/s)|(punch; 100 char/s); 8 b/char);
T(card; punch; 100 card/min);
T(printer; 300 line/min; 120 col/line); T(CRT; display);
T(light; pen); K(Dataphone); Ms(#0:7; magnetic tape);
Ms(#0:3; fixed head disk); P(display; '338);
Pc('Laboratory Instrument Computer/LINC);
T(Data Terminal Panel; digital; input, output).]

Fig. 1. DEC LINC-8-PDP-8 PMS diagram.

We  have  used  several  different  ways  of  saying  the  same  thing 
in  Fig.  1  in  order  to  show  the  range  of  descriptive  notations.  Thus 
the  64  Teletypes  are  shown  by  describing  a  single  connection 
through  a  switch  and  putting  the  number  of  links  in  the  switch 
above  the  connecting  line. 

Consider,  finally,  the  Pc  in  Fig.  1.  We  have  given  a  few  param- 
eters: the  data-types,  the  processor  state,  the  descendants,  etc. 
These  few  parameters  hardly  define  a  processor.  Several  other 
important  parameters  are  easily  inferred  from  the  Mp.  The  basic 
operation  time  in  a  processor  is  a  small  multiple  of  the  read  time 
of its Mp. Thus it is predictable that Pc stores and reads information
in 2 × 1.5 µs (one for instruction fetch, one for data fetch).
Again,  where  this  is  not  the  case  (as  in  the  CDC  6600)  it  is  neces- 
sary to  say  so.  Similarly,  the  word  size  in  the  Pc  is  the  same  as 
the  word  size  of  the  Mp:  12  data  bits.  More  generally,  the  Pc  must 
have  instructions  that  take  care  of  evoking  all  the  components  of 
the  PMS  structure.  These  instructions  do  not  see  the  switches  and 
controls as distinct entities; rather, they speak directly to the operation
of the M's and T's connected via these switches and controls.

Other  summary  parameters  could  have  been  given  for  the  Pc. 
None  of  them  would  come  close  to  specifying  its  behavior 
uniquely,  although  to  those  knowledgeable  in  computers  still  more 
can  be  inferred  from  the  parameters  given.  For  instance,  knowing 
both  the  data-types  available  in  a  Pc  and  the  number  of  instruc- 
tions, one  can  come  very  close  to  predicting  exactly  what  the 
instructions  are.  Nevertheless,  the  way  to  describe  a  Pc  in  full 
detail  is  not  to  add  larger  and  larger  numbers  of  summary  param- 
eters. It  is  more  direct  and  more  revealing  to  develop  a  description 
at  the  level  of  instructions,  which  is  the  ISP  description. 

Let  us  end  this  introduction  to  the  PMS  descriptive  system  by 
returning  to  a  critical  item  in  its  design  philosophy.  A  descriptive 
scheme  for  systems  as  complex  and  detailed  as  digital  computers 
must  have  the  ability  to  range  from  extremely  complete  to  highly 
simplified  descriptions.  It  must  permit  highly  compressed  descrip- 
tions as  well  as  extensive  ones  and  must  permit  the  selective 
suppression  or  amplification  of  whatever  aspects  of  the  computer 
system  are  of  interest  to  the  user.  PMS  attempts  to  fulfill  these 
criteria  by  providing  simple  conventions  for  detailed  description 
with  additional  conventions  that  permit  abbreviation  and  abstrac- 
tions, almost  without  limit.  The  result  is  a  notation  that  may  seem 
somewhat fluid, especially on first contact in such a brief introduction
as this. But once assimilated, PMS seems to allow some
of  the  flexibility  of  natural  language  within  enough  notational 
controls  to  enhance  communication  considerably. 


ISP  level  of  description 

The  behavior  of  a  processor  is  completely  determined  by  the 
nature  and  sequence  of  its  operations.  This  sequence  is  completely 
determined  by  a  set  of  bits  in  Mp,  called  the  program,  and  a  set 
of  interpretation  rules  that  specify  how  particular  bit  configura- 
tions evoke  the  operations.  Thus,  if  we  specify  the  nature  of  the 
operations  and  the  rules  of  interpretation,  the  actual  behavior  of 
the  processor  depends  solely  on  the  particular  program  in  Mp  (and 
also  on  the  initial  state  of  data).  This  is  the  level  at  which  the 
programmer  wants  the  processor  described — and  which  the  pro- 
gramming manual  provides — since  he  himself  wishes  to  determine 
the  program.  Thus  the  ISP  (Instruction-set  processor)  description 
must  provide  a  scheme  for  specifying  any  set  of  operations  and 
any  rules  of  interpretation. 

Actually,  the  ISP  descriptive  scheme  need  only  be  general 
enough  to  cover  some  broad  range  of  possibilities  adequate  for 
past  and  current  generations  of  machines  along  with  their  likely 
descendants.  As  we  saw  earlier  when  discussing  the  PMS  level, 
there  are  certain  restrictions  that  can  be  placed  on  the  nature  of 
a  computer  system,  specializing  it  from  the  more  general  concept 
of  a  discrete  state  system.  It  processes  a  medium,  called  informa- 
tion; it  is  a  system  of  discrete  components  linked  together  by 
information  transfers;  and  each  component  is  characterized  by  a 
small  set  of  operations.  These  assumptions  are  built  into  the  PMS 
descriptive  scheme  in  an  integral  way.  Similarly,  for  the  ISP  level 
we  can  add  two  more  such  restrictions,  which  will  in  turn  provide 
the  shape  of  its  descriptive  scheme. 

The  first  specialization  is  that  a  program  can  be  conceived  as 
a  distinct  set  of  instructions.  Operationally,  this  means  that  some 
set  of  bits  is  read  from  the  program  in  Mp  to  a  memory  within 
P,  called  the  instruction  register,  M.instruction/M.i.  This  set  of 
bits  then  determines  the  immediately  following  sequence  of  oper- 
ations. Only  a  single  operation  may  be  determined,  as  in  setting 
a  bit  in  the  internal  state  of  the  P;  or  a  substantial  number  of 
operations  may  be  determined,  as  in  a  "repeat"  instruction  that 
evokes  a  search  through  Mp.  In  a  typical  one-  or  two-address 
machine  the  number  of  operations  per  instruction  ranges  from  two 
to  five.  In  any  event,  after  this  sequence  of  operations  has  occurred, 
the next instruction to be fetched from Mp is determined and
obtained.  Then  the  entire  cycle  repeats  itself. 




The  cycle  of  activity  we  have  just  described  is  called  the  inter- 
pretation cycle,  and  the  part  of  the  P  that  performs  it  is  called 
the  interpreter.  The  effect  of  each  instruction  can  be  expressed 
entirely  in  terms  of  the  information  held  in  memories  at  the  end 
of  the  cycle  (plus  any  changes  made  to  the  outside  world).  During 
execution,  operations  may  have  internal  states  of  their  own  as 
sequential  circuits  which  are  not  represented  as  bits  in  memories. 
But  by  the  end  of  the  interpretation  cycle,  whatever  effect  is  to 
be  carried  on  to  a  later  time  has  been  staticized  in  bits  in  some 
memory.¹
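
The interpretation cycle described above can be sketched as a loop. The toy machine below is entirely hypothetical (its five opcodes, the tuple encoding of instructions, and the names `run`, `pc`, and `acc` are assumptions of this example and do not describe any machine in the book); it only shows the fetch, execute, and next-instruction steps, with every lasting effect left in a memory at the end of each cycle.

```python
def run(mp, max_cycles=100):
    """Interpret the program held in Mp (a list) and return the processor state."""
    state = {"pc": 0, "acc": 0, "halted": False}   # the processor state memory
    for _ in range(max_cycles):
        if state["halted"]:
            break
        opcode, operand = mp[state["pc"]]   # fetch into the instruction register M.i
        state["pc"] += 1                    # default: next instruction in sequence
        if opcode == "load":
            state["acc"] = mp[operand]
        elif opcode == "add":
            state["acc"] += mp[operand]
        elif opcode == "store":
            mp[operand] = state["acc"]
        elif opcode == "jump":
            state["pc"] = operand           # instruction chooses its successor
        elif opcode == "halt":
            state["halted"] = True
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return state


# Cells 0-3 hold instructions; cells 4-6 hold data.
program = [("load", 4), ("add", 5), ("store", 6), ("halt", 0), 2, 3, 0]
final = run(program)
print(program[6], final)
```

Between cycles nothing is carried forward except the contents of `mp` and `state`, which is the sense in which the effect of each instruction is expressed in terms of information held in memories.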

The  second  additional  specialization  is  on  the  data-operations. 
A  processor's  total  set  of  operations  can  be  divided  into  two  parts. 
One  part  contains  those  necessary  to  operate  other  components 
given  in  the  PMS  diagram:  links,  switches,  memories,  transducers, 
etc.  The  operations  associated  with  these  components  and  the 
extent  to  which  they  can  be  indirectly  controlled  from  P  are  highly 
restrained  by  the  basic  nature  of  the  components  and  their  con- 
trols. The  second  part  contains  those  operators  associated  with  a 
processor's  D  component.  So  far  we  have  said  nothing  at  all  about 
them,  except  to  exclude  them  completelv  from  all  PMS  com- 
ponents except  P.  These  are  the  operations  that  produce  bit  pat- 
terns with  new  meaning — that  do  all  the  '"real "  processing  or 
changing  of  information.-  If  it  were  not  for  data-operations,  the 
svstem  would  merely  transmit  information.  .\s  we  noted  in  our 
original  definitions  (page  17)  a  P  (including  a  Dl  is  the  onl\  com- 
ponent capable  of  directlv  changing  information.  .\  P  can  create, 
modify,  and  destroy  information  in  a  single  operation.  As  we  noted 
earlier,  D's  are  like  the  primitive  components  in  an  analog  com- 
puter. Later,  when  we  e.vpress  instniction  sets  as  simple  arithmetic 
expressions,  the  D's  are  the  primitive  operators,  for  example, 

'This  description  holds  tnie  for  a  P  with  a  single  active  control  ithe  inter- 
preter). Some  P's  (e.g.,  the  CDC  66(K))  have  several  active  controls  and 
get  involved  in  "overlapping  "  several  instructions  and  in  reordering  opera- 
tions according  to  the  data  and  devices  available.  With  these,  a  more 
complex  statement  is  required  to  express  the  same  general  restriction  we 
have  been  stating  for  simple  P's:  that  the  program  can  be  decomposed  into 
a  sequence  of  bit  sets  (the  instructions),  each  of  which  has  local  control 
over  the  beha\'ior  of  the  P  for  a  limited  period  of  time,  with  all  interinstruc- 
tion  eifects  being  staticized  as  bits  in  M  s. 

²In principle, this view that only D components do "real" processing is
false. It can be shown that a universal Turing machine can be built from
M, S, L, and K components. The key operation is the write operation into
M, which suffices to construct arbitrary bit patterns under suitably
controlled switches. Hence arbitrary data operations can be built up. The
stated view is correct in practice in that the data-operations provided in
a P are highly efficient for their bit transformations. Only the foolish
add integers in a modern computer by table look-up.


+, −, ×, /, × 2ⁿ, ∧, ∨, ⊕, concatenation, etc., which are evoked
by the instruction-set-interpreter part of a processor.

The specialization is that all the data-operations can be characterized
as working on various data-types. For example, there is a data-type
called the signed integer, and there are data-operations that add two
signed integers, subtract them, multiply them, take their absolute
value, test for which of the two is greater, etc. A data-type is a
compound of two things: the referent of the bit pattern (e.g., that this
set of bits refers to an integer in a certain range) and the
representation in the bit pattern (e.g., that bit 31 is the sign, and
bits 30 to 0 are the coefficients of successive powers of 2 in the
binary representation of the integer). Thus a processor may have several
data-types for representing numbers: unsigned integers, signed integers,
single precision floating point, double precision floating point, etc.
Each of these is a distinct data-type, because it requires distinct
operations to process it. On occasion, operations for several data-types
may all be encoded into a single instruction with a data-type subfield
that selects whether the data are fixed or floating point. The
operations are still separate, no matter how packaged, and so their
data-types remain distinct.
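The distinction between referent and representation can be made concrete with a small sketch. The two function names below are hypothetical and are not part of ISP; the sketch assumes the 32-bit layout of the example just given (bit 31 is the sign, bit i carries weight 2**i) and simply reads one bit pattern under two data-types.

```python
# Two hypothetical 32-bit data-types sharing one bit pattern.
# Assumption: bit 31 is the sign bit and bit i carries weight 2**i,
# following the numbering used in the example in the text.

def as_unsigned_integer(pattern):
    """Referent: an integer in the range 0 .. 2**32 - 1."""
    return pattern & 0xFFFFFFFF

def as_signed_integer(pattern):
    """Referent: sign in bit 31, magnitude in bits 30 to 0."""
    magnitude = pattern & 0x7FFFFFFF
    negative = (pattern >> 31) & 1
    return -magnitude if negative else magnitude

pattern = 0x80000005
print(as_unsigned_integer(pattern))
print(as_signed_integer(pattern))
```

The same 32 bits yield two different integers, which is why an add on one data-type cannot simply be reused for the other.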

With these two additional specializations (instructions and
data-types) we can define an ISP description of a processor. A
processor is completely described at the ISP level by giving its
instruction set and its interpreter in terms of its operations,
data-types, and memories.

Let us concentrate first on the instruction set, leaving the
interpreter until later. The effect of each instruction is described
by an instruction-expression, which has the form

    condition → action-sequence

The condition describes when the instruction will be evoked, and
the action-sequence describes what transformations of data take
place between what memories. The right arrow (→) is the control
action (of a K) of evoking an operation.

Recall that all operations in a computer system result in
modifications of bits in memories. Thus each action in a sequence
ultimately has the form

    memory-expression ← data-expression

The left arrow (←) is the transmit operation of a link and
corresponds to the ALGOL assign operation. The left side must describe
the memory location that is affected; the right side must describe
the information pattern that is to be placed in that memory location.
The details of data expressions and memory expressions are
patterned on standard mathematical notation and are communicated
most easily by examples. The same is true of the condition,
which is a standard expression involving boolean values and
relations among memory contents.

Before we get to the examples, let us note two features of the
action sequence. The first is that each action in the sequence may
itself be conditional, i.e., of the form "condition → action-sequence."
The second is that some actions are sequentially dependent on each
other, because the result of one is used as an input to the other; on
other occasions a set of actions are independent and can occur in
parallel. The normal situation is the parallel one. Thus, in the
action sequence

    Y1 ← X1; Y2 ← X2; Y3 ← X3; Y4 ← X4

all the transfers of information may be considered simultaneous.
In particular, all the X's have their values defined by the situation
before the transfer. For example, if A and B are two registers, then

    (A ← B; B ← A)

exchanges the contents of A and B. When sequence is required,
the term "next" is used; thus

    (A ← B; next B ← A)

transfers the contents of B to A and then transfers it back to B,
leaving both A and B holding the original contents of B (and so
this contrived example is essentially just A ← B).
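The difference between the parallel reading and "next" can be sketched in a few lines. This is only an illustration under our own assumptions: the helper names run_parallel and run_sequence are hypothetical, and registers are modeled as entries of a dictionary.

```python
# Sketch of ISP action-sequence semantics.
# Assumption: a machine state is a dict from register names to values,
# and each action is a pair (destination, function of the old state).

def run_parallel(state, actions):
    """Evaluate every right side against the old state, then transmit."""
    values = [(dest, source(state)) for dest, source in actions]
    new_state = dict(state)
    for dest, value in values:
        new_state[dest] = value
    return new_state

def run_sequence(state, steps):
    """Each step is a parallel group; 'next' separates the steps."""
    for actions in steps:
        state = run_parallel(state, actions)
    return state

a_gets_b = ("A", lambda s: s["B"])
b_gets_a = ("B", lambda s: s["A"])

start = {"A": 1, "B": 2}
print(run_parallel(start, [a_gets_b, b_gets_a]))      # (A <- B; B <- A)
print(run_sequence(start, [[a_gets_b], [b_gets_a]]))  # (A <- B; next B <- A)
```

The first call exchanges A and B; the second leaves both holding the original contents of B, as described above.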

An  ISP  example  using  the  DEC  PDP-8 

The  memories,  operations,  instructions,  and  data-types  all  need 
to  be  declared  for  a  processor.  Again  these  are  most  easily  ex- 
plained by  example,  although  full  definitions  are  given  in  the 
Appendix  at  the  end  of  the  book.  Consequently,  let  us  examine 
the  ISP  description  of  the  Pc  of  the  PDP-8,  given  in  Fig.  2  (the 
PDP-8  is  explained  fully  in  Chap.  5).  Throughout  the  book  the 
ISP  descriptions  of  computers  follow  a  more  highly  structured 
format  than  the  ISP  notation  requires,  in  order  to  help  the  reader 
see  the  similarities  among  the  computers. 

Processor state. We first need to specify the memories of the Pc
in detail, providing names for the various bits. Thus,

    AC<0:11>       the accumulator

is a memory called AC, with 12 bits, labeled from 0 to 11 from
the left. Comments are given in italics¹; in this case that AC is

¹There are a few features of the notation, such as the use of italics, which
are not easily carried over into current computer character sets. Thus, the
ISP of Fig. 2 is a publication language.


called the accumulator (by the designers of the PDP-8). AC
corresponds to an actual register in the Pc. However, the ISP does not
imply any particular implementation, and names may be assigned
to various sets of bits purely for descriptive convenience. The colon
is used to denote a range or list of values. Alternatively, we could
have listed each bit, separating the bit names by commas, as

    AC<0,1,2,3,4,5,6,7,8,9,10,11>

Having defined a second memory, L (which has only a single bit),
one could define a combined register, LAC, in terms of L and
AC as

    LAC<L,0:11> := L□AC

The colon-equal (:=) is used for definition, and the middle square
box (□) denotes concatenation. Note that the bit named L of
register LAC merely happens to correspond to the 1-bit L register.
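A short sketch may help fix the bit-naming convention. It assumes, as stated above, that bits are labeled from the left, so that AC<0> is the most significant bit; the function names concatenate, split, and lac_bit are our own and appear nowhere in ISP.

```python
# Sketch of LAC<L,0:11> := L□AC held in one 13-bit integer.
# Assumption: L is the leftmost bit, followed by AC<0> .. AC<11>.

AC_BITS = 12
AC_MASK = 0o7777

def concatenate(l, ac):
    """Form L□AC from a 1-bit L and a 12-bit AC."""
    return ((l & 1) << AC_BITS) | (ac & AC_MASK)

def split(lac):
    """Recover (L, AC) from the combined register."""
    return (lac >> AC_BITS) & 1, lac & AC_MASK

def lac_bit(lac, name):
    """Read one bit of LAC; name is 'L' or an AC bit number 0..11."""
    position = 0 if name == "L" else name + 1  # position counted from the left
    return (lac >> (AC_BITS - position)) & 1

lac = concatenate(1, 0o4001)
print(split(lac), lac_bit(lac, "L"), lac_bit(lac, 0), lac_bit(lac, 11))
```

Nothing here implies that a 13-bit register exists in hardware; the combined name is purely a descriptive convenience.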

Primary memory state. In dealing with addressed memory, either
Mp or various forms of working memory within the processor, we
need to indicate multidimensional arrays. Thus

    Mp[0:7777₈]<0:11>

gives primary memory as consisting of 10000₈ (i.e., base 8) words
of 12 bits each, being addressed as indicated. Such an address does
not necessarily reflect the switching structure through which the
address occurs, though it often will. (Needless to say, it reflects
only addressing space, and not how much actual M is available
in a PMS structure.) In general, only memory within the processor
will occur as operands of the processor's operators. The one
exception is primary memory (Mp), which was defined as a memory
external to a P but directly accessible from it.

In  writing  memories  it  is  natural  to  use  base  10  for  all  numbers 
and  to  consider  the  basic  i-unit  of  the  memory  to  be  a  bit.  This 
is  always  assumed  unless  otherwise  indicated.  Since  we  used  base 
8  numbers  above  for  specifying  the  addressing  range,  we  indicated 
the  change  of  number  base  by  a  subscript,  in  standard  fashion. 
If  a  unit  of  information  other  than  the  bit  were  to  be  used,  we 
would  subscript  the  angle  brackets.  Thus 

    Mp[0:7777₈]<0:1>₆₄

reflects the same memory. The choice carries with it, of course,
some presumption of organization in terms of base 64 characters,
but this would show up in the specification of the operators (and
is not true, in fact, of the PDP-8). We can also have
multidimensional memories (i.e., arrays), though no examples occur in
Fig. 2. These add the extra dimensions with an extra pair of
brackets, for example,

    M[a:b][c:d] ... [g:h]<x:y>

The PDP-8 memory might better be described as:

    Mp[0:7][0:31][0:127]<0:11>

representing 8 memory fields with 32 pages per field, 128 words
per page, and 12 bits per word.
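The three-dimensional declaration can be read as a decomposition of one linear address. The sketch below assumes a 15-bit extended address (8 × 32 × 128 words); the function names are hypothetical and serve only to show the arithmetic.

```python
# Sketch: reading Mp[0:7][0:31][0:127] as a split of a linear address.
# Assumption: field is the leftmost 3 bits, page the next 5, word the last 7.

FIELDS, PAGES, WORDS = 8, 32, 128

def split_address(address):
    """Return (field, page, word) for a 15-bit address."""
    word = address & 0o177
    page = (address >> 7) & 0o37
    field = (address >> 12) & 0o7
    return field, page, word

def join_address(field, page, word):
    """Inverse of split_address."""
    return (field << 12) | (page << 7) | word

print(split_address(0o12345))
print(PAGES * WORDS == 0o7777 + 1)  # one field is the 10000 (octal) words of Mp[0:7777]
```

A single field thus has exactly the address space of the one-dimensional declaration given earlier.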

Instruction format. It is possible to have several names for the
same set of bits; e.g., having defined instruction<0:11> we define
the format of the instruction as follows:

    op<0:2>            := instruction<0:2>
    indirect␣bit/ib    := instruction<3>
    page␣0␣bit/p       := instruction<4>
    page␣address<0:6>  := instruction<5:11>

The colon-equal (:=) is used to allow us to assign names to various
parts of the instruction. In effect, we are making a definition which
is equivalent to the conventional diagram for the instruction:

    bit:     0     2   3    4    5             11
    field:   op        ib   p    page␣address

where ib is the indirect␣bit and p is the page␣0␣bit.

Notice that in page␣address the names of all the bits have been
shifted, e.g., page␣address<4> := instruction<9>.
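The field definitions amount to shifts and masks on a 12-bit word. The following is a minimal sketch under the left-to-right bit numbering used above; the helper field and the dictionary keys (with underscores standing in for the ISP space symbol) are our own choices.

```python
# Sketch: extracting the named fields of a 12-bit PDP-8 instruction.
# Assumption: bit 0 is the leftmost (most significant) bit of the word.

def field(word, first, last, width=12):
    """Return bits first..last of word, numbered from the left."""
    shift = width - 1 - last
    mask = (1 << (last - first + 1)) - 1
    return (word >> shift) & mask

def decode(instruction):
    """Name the parts of an instruction as in the format definition."""
    return {
        "op": field(instruction, 0, 2),
        "indirect_bit": field(instruction, 3, 3),
        "page_0_bit": field(instruction, 4, 4),
        "page_address": field(instruction, 5, 11),
    }

parts = decode(0o1234)
print(parts)
# page_address<4> names the same bit as instruction<9>:
print(field(parts["page_address"], 4, 4, width=7) == field(0o1234, 9, 9))
```

The last line checks the renumbering noted above: a bit keeps its position but acquires a new name inside the shorter field.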

The Appendix gives the permissible alphabet of symbols for
ISP. In general, a "name" can be any combination of uppercase
and lowercase letters and numerals, not including names which
would be considered numbers (integers, mixed numbers, fractions,
etc.). A compound name can be sequences of names separated by
spaces ( ). In order to make certain compound names more
readable, a space symbol (␣) may optionally be used to signify the
non-printing character. Periods (.) and hyphens (-) are also used.

The instruction set. With all the registers defined, we can give
the instructions. These are shown on the second page of Fig. 2
(there are some unexplained parts left on the bottom of the first
page, to which we will return). The second page is actually a single
expression, named Instruction␣execution, which consists of a list
of instructions. They are listed vertically down the page for ease
of reading. Each instruction consists of a condition and an action
sequence, separated by the condition arrow (→). In this case the
condition is an expression of the form (op = octal digit). Recall
that op is instruction<0:2>, and so this expresses the condition that
the operation code of the machine have a particular value. Each
condition has been given a name in passing; e.g., "and" is the name
of (op = 0). This provides the correspondence between the
operation code and the mnemonic name of the operation code. If this
correspondence had been established elsewhere, or if we did not
care what numerical operation code the "and" instruction is, we
could have written

    and → (AC ← AC ∧ M[z])

We would not have known what condition the name "and" stood
for but could have surmised (with little difficulty) that it was
simply an equality test on the operation code. We will do this
on a number of the ISP descriptions later in the book. Most
generally the form of an instruction is written as

    two's complement add/tad (:= op = 1) →
        (L□AC ← L□AC + M[z])

Here, we simultaneously define the action of the tad instruction,
its name, an abbreviation for the name, and the conditions for tad's
execution. The parentheses are, in effect, a remark to allow an
inline definition. For example, the above single ISP statement is
equivalent to

    two's complement add/tad → (L□AC ← L□AC + M[z])

followed by

    tad := (op = 1)

All the instructions in the list constitute the total instruction
repertoire of the Pc. Since all the conditions are disjoint, one and
only one condition will be satisfied when a given instruction is
interpreted; hence one and only one action sequence will occur.
Actually, all operation codes might not be present, and so there
would be some illegal op codes that would evoke no action
sequence. The act of selection is usually called operation decoding.
Again, ISP implies no particular mechanism by which this is
carried out. Normally a logic circuit works directly on the op part
of the instruction register, and the way op codes are assigned is
significant for the complexity of this decoding circuit. Thus,
sometimes one exhibits the instructions in a two-dimensional decoding
diagram that makes it evident what these bit patterns are (see Fig.
2 in Chap. 5), rather than in a linear list.
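Operation decoding as a selection among disjoint conditions can be sketched as follows. This illustrates the idea only and is not a description of any decoding circuit; the table CONDITIONS and the function decode_operation are hypothetical names, with the op codes taken from Fig. 2.

```python
# Sketch: operation decoding as selection among disjoint conditions.
# Each named condition is an equality test on op = instruction<0:2>.

CONDITIONS = {
    "and": lambda op: op == 0,
    "tad": lambda op: op == 1,
    "isz": lambda op: op == 2,
    "dca": lambda op: op == 3,
    "jms": lambda op: op == 4,
    "jmp": lambda op: op == 5,
    "iot": lambda op: op == 6,
    "opr": lambda op: op == 7,
}

def decode_operation(instruction):
    """Return the one instruction name whose condition is satisfied."""
    op = (instruction >> 9) & 0o7
    selected = [name for name, condition in CONDITIONS.items() if condition(op)]
    if len(selected) != 1:
        raise ValueError("conditions are not disjoint and exhaustive")
    return selected[0]

print(decode_operation(0o1234))
print(decode_operation(0o7402))
```

Because the PDP-8 uses all eight values of op, every 12-bit word selects exactly one entry; a machine with unassigned op codes would need an explicit case for the words that evoke no action sequence.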

It  might  be  wondered  why  we  do  not  in  general  introduce  some 




Pc State

  AC<0:11>                               Accumulator
  L                                      Link bit/AC extension for overflow and carry
  PC<0:11>                               Program Counter
  Run                                    1 when Pc is interpreting instructions or "running"
  Interrupt␣state
  IO␣pulse␣1; IO␣pulse␣2; IO␣pulse␣4     IO pulses to IO devices

Mp State

  Extended memory is not included.

  M[0:7777₈]<0:11>
  Page␣0[0:177₈]<0:11> := M[0:177₈]<0:11>
                                         special array of directly addressed memory registers
  Auto␣index[0:7]<0:11> := Page␣0[10₈:17₈]<0:11>
                                         special array, when addressed indirectly, is incremented by 1

Pc Console State

  Keys for start, stop, continue, examine (load from memory),
  and deposit (store in memory) are not included.

  Data switches<0:11>                    data entered via console

Instruction Format

  instruction/i<0:11>
  op<0:2>            := i<0:2>           op code
  indirect␣bit/ib    := i<3>             0, direct; 1, indirect memory reference
  page␣0␣bit/p       := i<4>             0 selects page 0; 1 selects this page
  page␣address<0:6>  := i<5:11>
  this␣page<0:4>     := PC'<0:4>
  PC'<0:11>          := (PC<0:11> − 1)
  IO␣select<0:5>     := i<3:8>           selects a T or Ms device
  io␣p1␣bit          := i<11>            these 3 bits control the selective generation of −3 volts,
  io␣p2␣bit          := i<10>            0.4 μs pulses to I/O devices
  io␣p4␣bit          := i<9>
  sma                := i<5>             μ bit for skip on minus AC, operate 2 group
  sza                := i<6>             μ bit for skip on zero AC
  snl                := i<7>             μ bit for skip on non zero Link

Effective Address Calculation Process

  z<0:11> := (                                             effective
      ¬ib → z";
      ib ∧ (10₈ ≤ z" ≤ 17₈) → (M[z"] ← M[z"] + 1); next    auto indexing
      ib → M[z"])

  z'<0:11> := (
      ¬ib → z";
      ib → M[z"])

  z"<0:11> := (
      page␣0␣bit → this␣page□page␣address;                 direct address
      ¬page␣0␣bit → 0□page␣address)

  μ  microcoded instruction or instruction bit(s) within an instruction

Fig. 2. DEC PDP-8 ISP description.




Instruction Interpretation Process

  Run ∧ ¬(interrupt␣request ∧ Interrupt␣state) → (         no interrupt interpreter
      instruction ← M[PC]; PC ← PC + 1; next               fetch
      Instruction␣execution);                              execute

  Run ∧ interrupt␣request ∧ Interrupt␣state → (            interrupt interpreter
      M[0] ← PC; Interrupt␣state ← 0; PC ← 1)

Instruction Set and Instruction Execution Process

  Instruction␣execution := (
    and (:= op = 0) → (AC ← AC ∧ M[z]);                    logical and
    tad (:= op = 1) → (L□AC ← L□AC + M[z]);                two's complement add
    isz (:= op = 2) → (M[z'] ← M[z] + 1; next              index and skip if zero
        (M[z'] = 0) → (PC ← PC + 1));
    dca (:= op = 3) → (M[z] ← AC; AC ← 0);                 deposit and clear AC
    jms (:= op = 4) → (M[z] ← PC; next PC ← z + 1);        jump to subroutine
    jmp (:= op = 5) → (PC ← z);                            jump
    iot (:= op = 6) → (                                    μ in out transfer, microcoded to generate
        io␣p1␣bit → IO␣pulse␣1 ← 1; next                   up to 3 pulses to an io device addressed
        io␣p2␣bit → IO␣pulse␣2 ← 1; next                   by IO␣select
        io␣p4␣bit → IO␣pulse␣4 ← 1);
    opr (:= op = 7) → Operate␣execution                    the operate instruction is defined below
  )                                                        end Instruction␣execution

Operate Instruction Set

  The microprogrammed operate instructions: operate group 1,
  operate group 2, and extended arithmetic are defined as a separate
  instruction set.

  Operate␣execution := (
    cla (:= i<4> = 1) → (AC ← 0);                          μ clear AC. Common to all operate instructions.
    opr␣1 (:= i<3> = 0) → (                                operate group 1
      cll (:= i<5> = 1) → (L ← 0); next                    μ clear link
      cma (:= i<6> = 1) → (AC ← ¬AC);                      μ complement AC
      cml (:= i<7> = 1) → (L ← ¬L); next                   μ complement L
      iac (:= i<11> = 1) → (L□AC ← L□AC + 1); next         μ increment AC
      ral (:= i<8:10> = 2) → (L□AC ← L□AC × 2 {rotate});   μ rotate left
      rtl (:= i<8:10> = 3) → (L□AC ← L□AC × 2² {rotate});  μ rotate twice left
      rar (:= i<8:10> = 4) → (L□AC ← L□AC / 2 {rotate});   μ rotate right
      rtr (:= i<8:10> = 5) → (L□AC ← L□AC / 2² {rotate})); μ rotate twice right
    opr␣2 (:= i<3,11> = 10₂) → (                           operate group 2
      skip condition ⊕ (i<8> = 1) → (PC ← PC + 1); next    μ AC,L skip test
        skip condition := ((sma ∧ (AC < 0)) ∨ (sza ∧ (AC = 0)) ∨ (snl ∧ L))
      osr (:= i<9> = 1) → (AC ← AC ∨ Data switches);       μ "or" switches
      hlt (:= i<10> = 1) → (Run ← 0));                     μ halt or stop
    EAE (:= i<3,11> = 11₂) → EAE␣instruction␣execution)    optional EAE description



additional  conventions  into  the  language,  e.g.,  list  the  instructions 
in  a  table  with  their  mnemonic  names  in  a  special  column,  rather 
than  write  the  whole  affair  as  an  expression.  (In  fact,  if  you  ex- 
amine the  first  page  of  Fig.  2,  you  will  note  that  the  entire  descrip- 
tion of  the  PDP-8  Pc  is  a  single  expression.)  The  reason  is  that 
although  many  processors  fit  such  a  format  very  well,  not  all  do 
so,  e.g.,  microprogrammed  machines.  By  making  the  ISP  descrip- 
tion a  general  expression  for  evoking  action-sequences,  we  obtain 
the  generality  we  need  to  cover  all  the  variations.  We  will  have 
two  examples  with  the  PDP-8  itself:  the  microprogrammed  feature 
and  the  fact  that  the  interpretive  cycle  simply  becomes  part  of 
the  total  expression  for  the  behavior  of  the  processor. 

Let us now consider the action-sequence. We use standard
mathematical infix notation. Thus we write

    AC ← AC ∧ M[z]

This indicates that the word in Mp at address z is ANDed with
the accumulator and the result left in the accumulator. It is
assumed that the operation designated by ∧ is well understood. (The
←, of course, is the transmit operation.) Each processor will have
a basic set of operations that work on data-types of the machine.
Here the data-type is simply the 12-bit word viewed as an array
of bits.

Operators need not involve memories actually within the Pc
(the processor state). Thus,

    Mp[z] ← Mp[z] + 1

expresses a change in a word in Mp directly. That this must be
mechanized in the PDP-8 by means of some register in Pc is
irrelevant to the ISP description.

We also use functional notation; for example,

    AC ← abs(AC)

replaces the contents of the AC with its absolute value. When
an action has an unspecified function or operation we generally
write

    A ← f(A,B, ...)       or       A ← u B       or       A ← B b C

for function, unary operation, and binary operation, respectively.

Effective-address calculation process. In the examples just given
we used z as the address in Mp. This is the effective address and
is defined as a conditional expression (in the manner of ALGOL
or LISP):

    z<0:11> := (
        ¬ib → z";
        ib ∧ (10₈ ≤ z" ≤ 17₈) → (M[z"] ← M[z"] + 1); next
        ib → M[z"])

The right arrow (→) is analogous to the conditional sign used in
the main instruction, equivalent to the "if . . . then . . ." of
ALGOL. The parentheses are used to indicate grouping in the
usual fashion. However, we arrange expressions on the page to
make reading easier.

As  the  expression  for  z  shows,  we  permit  conditionals  within 
conditionals  and  also  the  nesting  of  definitions  (z  is  defined  in  terms 
of  z").  Again,  we  should  emphasize  that  the  structure  of  such 
definitions  may  reflect  the  underlying  hardware  organization,  but 
it  need  not.  When  describing  existing  processors,  as  in  this  book, 
the  ISP  description  often  reflects  the  hardware.  But  if  one  were 
designing  a  processor,  the  ISP  expressions  would  be  stated  as 
design  objectives  for  the  RT  structure,  and  the  latter  might  differ 
considerably. 
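The nested definitions of z and z" can be followed step by step in a short sketch. It assumes the definitions of Fig. 2 (this␣page taken from PC − 1, since PC has already been incremented by the fetch, and auto indexing on addresses 10₈ through 17₈); the function names and the use of a Python list for Mp are our own.

```python
# Sketch of the effective-address calculation z, following Fig. 2.
# Assumptions: memory is a list of 4096 twelve-bit words; pc is the value
# of PC after the fetch has incremented it.

WORD_MASK = 0o7777

def direct_address(instruction, pc):
    """z'': page_0_bit selects this page (1) or page 0 (0)."""
    page_0_bit = (instruction >> 7) & 1
    page_address = instruction & 0o177
    this_page = ((pc - 1) & WORD_MASK) >> 7
    page = this_page if page_0_bit else 0
    return (page << 7) | page_address

def effective_address(instruction, pc, memory):
    """z: direct, indirect, or indirect with auto indexing."""
    indirect_bit = (instruction >> 8) & 1
    z2 = direct_address(instruction, pc)
    if not indirect_bit:
        return z2
    if 0o10 <= z2 <= 0o17:  # auto-index registers are incremented first
        memory[z2] = (memory[z2] + 1) & WORD_MASK
    return memory[z2]       # "next": read the (possibly updated) word

memory = [0] * 0o10000
memory[0o10] = 0o500
print(oct(effective_address(0o1234, 0o201, memory)))  # direct, this page
print(oct(effective_address(0o1410, 0o201, memory)))  # indirect via auto-index 10
```

As with the ISP itself, the sketch says what address results, not which registers a real Pc would use to compute it.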

Special note should be taken of the opr instruction (op = 7)
in Fig. 2, since it provides a microprogramming feature. There
are two separate options depending on instruction<3> being 0 or
1. But common to both is the operation of clearing the AC (or
not), associated with instruction<4>. Then, within one option
(instruction<3> = 0) there are a series of independently executable
actions (following the clearing of L); within the other
(instruction<3> = 1), there are three independently settable control
actions. The nested conditionals and the use of "next" to force
sequential behavior make it easy to see exactly what is going on
(in fact a good deal easier than describing it in natural language,
as we have been doing).
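The first option (operate group 1) can be sketched as a sequence of independently selected steps. The bit assignments follow Fig. 2; the ordering of the clear-AC action ahead of the group-1 actions is our assumption for the sketch, and all function names are hypothetical.

```python
# Sketch of operate group 1 (instruction<3> = 0), following Fig. 2.
# Assumption: cla is applied first, then the steps separated by "next".

AC_MASK = 0o7777
LAC_MASK = 0o17777

def bit(instruction, n):
    """Bit n of a 12-bit instruction, numbered from the left."""
    return (instruction >> (11 - n)) & 1

def rotate_left(lac, count):
    return ((lac << count) | (lac >> (13 - count))) & LAC_MASK

def rotate_right(lac, count):
    return ((lac >> count) | (lac << (13 - count))) & LAC_MASK

def operate_group_1(instruction, l, ac):
    if bit(instruction, 4):            # cla: clear AC
        ac = 0
    if bit(instruction, 5):            # cll: clear link; next
        l = 0
    if bit(instruction, 6):            # cma: complement AC
        ac = ~ac & AC_MASK
    if bit(instruction, 7):            # cml: complement L; next
        l ^= 1
    lac = (l << 12) | ac
    if bit(instruction, 11):           # iac: increment L□AC; next
        lac = (lac + 1) & LAC_MASK
    rotate = (instruction >> 1) & 0o7  # i<8:10>
    if rotate == 2:
        lac = rotate_left(lac, 1)      # ral
    elif rotate == 3:
        lac = rotate_left(lac, 2)      # rtl
    elif rotate == 4:
        lac = rotate_right(lac, 1)     # rar
    elif rotate == 5:
        lac = rotate_right(lac, 2)     # rtr
    return lac >> 12, lac & AC_MASK

print(operate_group_1(0o7041, 0, 5))   # cma and iac together
print(operate_group_1(0o7004, 1, 0))   # ral
```

Setting several bits in one instruction word combines the corresponding actions, which is the microprogramming feature referred to above.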

The instruction interpreter. We now have all the instructions
defined for the PDP-8, including the effective-address computation
(z). It remains to define the interpreter. From a hardware point
of view, an interpreter consists of the mechanisms for fetching a
new instruction, for decoding that instruction and executing the
operations so designated, and for determining the next instruction.
A substantial amount of this total job has already been taken care
of in the part of the ISP that we have just explained. Each
instruction carries with it a condition that amounts to one fragment of
the decoding operation. Likewise, any further decoding of the
instruction that might be done in common by the interpreter
(rather than by the individual operation circuits) is implied in the
expressions for each instruction, and by the expression for the
effective address. The only thing that is left is to fetch the next
instruction and to execute it.

In a standard machine, there is a basic principle that defines
operationally what is meant by the "next instruction." Normally
the current instruction address is incremented by 1, but other
principles are used (e.g., on a processor with a cyclic Mp). In
addition, several specific operations exist in the repertoire that can
affect what program is in control. The basic principle acts like
a default condition: If nothing specific happens to determine
program control, the normal "next" instruction is taken. Thus, in
the PDP-8 we get an interpretation process that is essentially the
classic fetch-execute cycle (ignoring interrupts):

    Run → (instruction ← M[PC]; PC ← PC + 1; next      fetch
           Instruction␣execution)                      execute

The sequence is evoked so long as Run is true (i.e., its bit value
is 1). The processor will simply cycle through the sequence,
fetching and then executing the instruction. In the PDP-8 there exists
a halt operation that sets Run to be 0, and the console keys can,
of course, stop the computer. It should be noted that the ISP
descriptions in this book do not, generally, include console behavior.
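The cycle can be sketched as a loop that runs while Run is 1. To keep the sketch short it makes loud simplifications that are ours, not the ISP's: only tad, dca, jmp, and the halt word 7402₈ are handled, interrupts are ignored, and z is taken as a page-0 direct address. The names step and run and the dictionary layout are hypothetical.

```python
# Sketch of: Run -> (instruction <- M[PC]; PC <- PC + 1; next Instruction_execution)
# Simplifications (assumptions): page-0 direct addressing only; only tad,
# dca, jmp, and hlt are implemented; no interrupts, no console.

AC_MASK = 0o7777

def step(cpu):
    """Fetch one instruction, then execute it."""
    memory = cpu["M"]
    instruction = memory[cpu["PC"]]            # fetch
    cpu["PC"] = (cpu["PC"] + 1) & AC_MASK
    op = (instruction >> 9) & 0o7              # execute
    z = instruction & 0o177                    # simplified effective address
    if op == 1:                                # tad: L□AC <- L□AC + M[z]
        lac = ((cpu["L"] << 12) | cpu["AC"]) + memory[z]
        cpu["L"], cpu["AC"] = (lac >> 12) & 1, lac & AC_MASK
    elif op == 3:                              # dca: M[z] <- AC; AC <- 0
        memory[z] = cpu["AC"]
        cpu["AC"] = 0
    elif op == 5:                              # jmp: PC <- z
        cpu["PC"] = z
    elif instruction == 0o7402:                # hlt: Run <- 0
        cpu["Run"] = 0

def run(cpu):
    while cpu["Run"]:
        step(cpu)
    return cpu

memory = [0] * 0o10000
memory[0o20:0o24] = [0o1030, 0o1031, 0o3032, 0o7402]  # tad 30; tad 31; dca 32; hlt
memory[0o30], memory[0o31] = 5, 7
cpu = run({"AC": 0, "L": 0, "PC": 0o20, "Run": 1, "M": memory})
print(cpu["M"][0o32], cpu["AC"], oct(cpu["PC"]), cpu["Run"])
```

The loop stops only because the program itself clears Run, which is the sense in which the halt operation belongs to the instruction set rather than to the interpreter.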

A state diagram (Fig. 3) is useful to represent the behavior of
the instruction-interpretation process. As an instruction is
interpreted, the system moves from state to state. Any of the states
can be null, in which case a simple transition is to be made to
the successor of the null state. The K(instruction interpreter)
controls these movements according to the information in the
instruction. Which states are null and which of multiple alternative
transitions occur depend on the instruction being interpreted.

    State name    Time in a state    Meaning

    soq/oq        toq                Operation to determine the instruction q
    saq/aq        taq                Access (to Mp) for the instruction q
    so.o/o.o      to.o               Operation to decode the operation of q
    sov.r/ov.r    tov.r              Operation to determine the variable address v
    sav.r/av.r    tav.r              Access (to Mp) to read the variable v
    so/o          to                 Operation specified in q
    sov.w/ov.w    tov.w              Operation to determine the variable address v
    sav.w/av.w    tav.w              Access (to Mp) to write variable v

Fig. 3. ISP interpretation state diagram.

Within each state, various operations are carried out, under
the control of subordinate K's. Note that the upper states in Fig.
3 are controlled by the Mp whereas the lower ones are controlled
by the Pc. We have tried to use a simple mnemonic scheme to
label these states: o for operation, q for instruction, a for access,
r for read, and w for write. Similarly, we prefix the state with t
to indicate the time duration of the state, and we may prefix the
state by s.

Figure 3 is somewhat more detailed than is usual. We will use
it in Chap. 3 to describe a number of different processors. However,
the figure simplifies to the familiar fetch-execute cycle:

    Fetch:    {oq, aq}
              t.fetch = toq + taq

    Execute:  {oo, ov.r, av.r, o, ov.w, av.w}
              t.execute = too + tov.r + tav.r + ... + tov.r + tav.r + ...
                          + to + tov.w + tav.w

Consider, by way of example, the tad instruction of the PDP-8,
using the general state diagram of Fig. 3. From the ISP, the net
effect is

    Run → (instruction ← M[PC]; PC ← PC + 1; next
           tad (:= op = 1) → (L□AC ← L□AC + M[z]))

where

    z<0:11> := (specifies the effective-address calculation process)

The state diagram has more detail to explain the computer's
behavior with respect to timing and its temporary registers. (Note
that a complete state diagram for the physical PDP-8 is given in Fig.
11 of Chap. 5.) The actual state table appears on page 31.

Notice again that the ISP description does not determine the
way the processor is to be organized to achieve this sequencing
or to take advantage of the fact that many instructions lead to
similar sequences. All it does is specify unambiguously what
operations must be carried out for a program in Mp. The ISP
description does specify the actual format of the instruction and how it
enters into the total operation, although sometimes indirectly. For
example, in the case of the and instruction (op = 0), the definition
of AC shows that the AC does not depend on the instruction, and
the definition of z shows that z depends on other fields of the
instruction (indirect␣bit, page␣0␣bit, page␣address). Likewise, the
form of the ISP expression shows that AC and PC both enter into
the instruction implicitly. That is, in the ISP description all
dependence on memory is explicit.¹

Data-types  and  data-operations 

This completes the description of the ISP for the PDP-8. For more
complex machines the number of data-types and the operations
on them are much more extensive. Then the data-types may be
declared independently of the instruction set, in the same manner
as we declared memory.

In fact, the one major piece of organization in the structure
of processors at the ISP level that has not appeared in our example
involves the data-types. Each data-type has a set of operations
that are proper to it. Add, subtract, multiply, and divide are all
proper to any numerical data-type, as well as absolute value and
negation. Not all of these need exist in a computer just because
it has the data-type, since there are several alternative bases, as
well as some levels of completeness. For instance, notice that the
PDP-8 first of all does not have multiply and divide (unless one
has its special option), thus having a relatively minimal level of
arithmetic operations, and second, it does not have a subtract
operation, using a two's complement add, which permits negation
(−AC) to be accomplished by complementation (¬AC) followed
by add 1. Still, the options are rather few, provided one has
decided to include a given data-type in the repertoire. In the
Appendix at the end of the book are given with each of the data-types
(or classes thereof) the sets of operations that are proper to that
data-type.
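The remark about negation is easy to check arithmetically. The sketch below assumes 12-bit two's complement words; negate and subtract are hypothetical helper names showing how a subtract can be built from complement, add 1, and the two's complement add.

```python
# Sketch: negation as complement followed by add 1, on 12-bit words.
# Assumption: values are 12-bit two's complement bit patterns.

AC_MASK = 0o7777

def negate(ac):
    """-AC computed as (¬AC) + 1, truncated to 12 bits."""
    return ((~ac & AC_MASK) + 1) & AC_MASK

def subtract(a, b):
    """a - b built only from negation and a two's complement add."""
    return (a + negate(b)) & AC_MASK

print(oct(negate(5)))
print(subtract(0o12, 0o5))
```

A machine with only the add can therefore offer subtraction to the programmer at the cost of a few extra instructions.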

The PDP-8, for example, does not have several data
representations for what is, externally considered, the same entity. An
operator that does a floating add and one that does an integer add
are not the same. However, we will denote both by the same
symbol (in this case, +), indicating the difference parenthetically
after the expression. Alternatively, the specification of the data
type can be attached to the data. Thus, in the IBM 7094 we have
the instructions

¹This is not correct, actually. In physically realizing an ISP description,
additional memories may be utilized (they may even be necessary). It can
be said that in the ISP description these memories are implicit. However,
a consistent and complete description of an ISP can be made without use
of these additional memories, whereas with, say, a single-address machine
it does not seem possible to describe each instruction without some
reference to the implicit memories, as we see in the effective-address
calculation procedures where definitions look much like registers.


Chapter  2  |  The  PMS  and  ISP  descriptive  systems  31 


States              Time    ISP effect              Operational description

S.fetch    soq      toq     MA ← PC;                Calculate the address of the instruction, q,
                            PC ← PC + 1             and calculate the address of the next
                                                    instruction, q + 1. The address is stored in
                                                    the address register, MA, used to control
                                                    the access.
           saq      taq     MB ← M[MA]              Fetch the data from memory location M[MA]
                                                    (i.e., essentially M[PC]), and place the
                                                    result in a buffer (temporary) register.
           soo      too     IR ← MB<0:2>            Calculate and decode the instruction.
S.execute  sov.r    tov.r   MA ← f(MB, IR)          Calculate the address of the data.
           sav.r    tav.r   MB ← M[MA]              Fetch the data from Mp.
           so       to      L □ AC ← L □ AC + MB    Do the operation specified by the instruction.

Add → (AC ← AC + M[e]);
Add and carry logical word/ACL → (
    AC ← AC + M[e] {unsigned integer});
Floating add/FAD → (AC ← AC + M[e] {sf});
Unnormalized floating add/UFA → (AC ← AC + M[e] {suf});
Double-precision floating add/DFAD → (
    ACMQ ← ACMQ + M[e] □ M[e + 1] {df});
Double-precision unnormalized floating add/DUFA → (
    ACMQ ← ACMQ + M[e] □ M[e + 1] {duf})

The first one, without a special indicator of data-type, is taken
to be integer addition; the next, unsigned integer; the next, single
precision floating point; the next, unnormalized single precision
floating point; the next, double precision floating point; and the
last, unnormalized double precision floating point. Although there
are often clues that could be used to infer which form of addition
is being defined (e.g., double precision takes two words) we label
all but the integer operation.

We use braces { } to differentiate which operation is being
performed in the above examples. Thus, above, the data-type is
enclosed in braces and refers to all the memory elements
(operands) of the expression. Alternatively, we use braces as a modifier
on any memory to signify the information meaning. For example,
a fixed point to floating point data-conversion operation would be
given as

AC{floating} ← AC{fixed}


We also use braces as a modifier for the operation-type. For
example, shifting (left or right) can be a multiplication or division by
a base, but it is not always an arithmetic operation. In the PDP-8,
for instance, we have

L □ AC ← L □ AC × 2 {rotate}

where the end bits L and AC<11> are connected when a shift
occurs (the operator is also referred to as a circular shift).
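
To make the {rotate} modifier concrete, here is a minimal sketch in Python of a one-place left rotate of the 13-bit quantity L □ AC. It assumes the link L sits to the left of a 12-bit AC; the function name `rotate_left` and the constants are illustrative and do not come from the ISP description.

```python
AC_BITS = 12                       # assumed accumulator width
LAC_BITS = AC_BITS + 1             # link bit L concatenated with AC
LAC_MASK = (1 << LAC_BITS) - 1
AC_MASK = (1 << AC_BITS) - 1

def rotate_left(link: int, ac: int):
    """Rotate L □ AC left by one place; returns the new (link, ac) pair."""
    lac = (link << AC_BITS) | ac               # form the concatenation L □ AC
    wrapped = lac >> (LAC_BITS - 1)            # old L, which re-enters on the right
    lac = ((lac << 1) & LAC_MASK) | wrapped    # "× 2" with the end bits connected
    return lac >> AC_BITS, lac & AC_MASK
```

Unlike an arithmetic multiplication by 2, no bit is lost: thirteen successive rotates restore the original L and AC.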

In general, the nature of the operations used in processors is
sufficiently familiar to the computer professional that no definitions
are required, and they can all be taken as primitive. It is necessary
only to have agreed upon conventions for the different data
representations used. The Appendix provides the basic abbreviations.
In essence, a data-type is made up recursively of a concatenation
of subparts, which themselves are data-types. This concatenation
may be an iteration of a data-type to form an array. Fig. 4 shows
the structure of various data-types and how each is built from more
primitive data-types.

If required, an operation can be defined in terms of other
(presumably more primitive) operations. It is necessary first to
define the data format explicitly (including perhaps some
additional memory). Variables for the operands are permitted in the
natural way. For example, binary single-precision floating-point
multiplication on a 36-bit machine could be defined in terms of
the data fields as follows:


32  Part  1     The  structure  of  computers 


[Fig. 4 diagram not reproduced. Recoverable labels: scalar (or string),
1 element; vector, n elements (linear list, table, 1-dimensional array);
stacks, queues, linked lists; matrix, n × m elements (2-dimensional
array); n-dimensional array, d1 × d2 × ... × dn elements; simple
multiple type structures. Note in figure: "are normally considered
non-decomposable primitives."]

Fig. 4. Common data-types recognized by processor hardware.


sf mantissa/mantissa := <0:27>
sf sign/sign := <0>
sf exponent/exponent := <28:35>
sf exponent sign := <28>

x1 ← x2 × x3 {sf} := (
    x1 mantissa := x2 mantissa × x3 mantissa;
    x1 exponent := x2 exponent + x3 exponent;
    next x1 := normalize(x1) {sf})

where normalize is

x1 := normalize(x2) {sf} := (
    (x2 mantissa = 0) → (x1 exponent := 0);
    ((x2 mantissa ≠ 0) ∧ (x2<0> = x2<1>)) → (
        x1 mantissa := x2 mantissa × 2;
        x1 exponent := x2 exponent − 1; next
        x1 := normalize(x2) {sf}))
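
The normalize definition can be traced with a short sketch in Python. It is written as a loop rather than a recursion, and it makes several assumptions that go beyond the definition: the mantissa is held as a separate 28-bit field whose bits are numbered from the left (so mantissa bits <0> and <1> are word bits <0> and <1>), the exponent is an ordinary Python integer rather than an 8-bit field, and a zero mantissa returns a zero exponent. The names `bit` and `normalize` are illustrative.

```python
MANTISSA_BITS = 28                         # word bits <0:27>; bit <0> is the sign
MANTISSA_MASK = (1 << MANTISSA_BITS) - 1

def bit(mantissa: int, i: int) -> int:
    """Bit <i> of the mantissa, numbered from the left as in ISP."""
    return (mantissa >> (MANTISSA_BITS - 1 - i)) & 1

def normalize(mantissa: int, exponent: int):
    """Shift left while bits <0> and <1> agree, decrementing the exponent."""
    if mantissa == 0:
        return 0, 0                        # zero mantissa: exponent forced to 0
    while bit(mantissa, 0) == bit(mantissa, 1):
        mantissa = (mantissa << 1) & MANTISSA_MASK   # mantissa × 2
        exponent -= 1                                # exponent − 1
    return mantissa, exponent
```

The loop stops as soon as the sign bit and its right-hand neighbor differ, which is the two's complement condition for a normalized fraction; a nonzero mantissa reaches that condition in at most 27 shifts.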


Three additional aspects need to be noted with respect to
data-types: two substantive and one notational. First, not everything
one does with an item of data makes use of all the properties of
its data-type. For example, numbers have to be moved from place
to place. This operation is not a numerical operation and does
not depend on the item being a number. In fact, for the purpose
of data transmission, the item is only a word (assuming it fits into
a single word) and can be treated as such. Second, one can often
embed one kind of operation in another, so as to coalesce
data-types. We saw this to a small extent in the example above of the
PDP-8 arithmetic operations. A more pervasive example is
encoding the Mp addresses into the same integer data-type as is used
for regular arithmetic. Then there need be no separate data-type
for addresses.¹ The upshot of both these aspects can be seen
below where we present an outline structure of data-types that
shows how one data-type can be embedded in another for various
purposes.

Data-types  embedded  in  other  data-types  for  common  operations 

word 

integer 

fraction 
mixed 

unsigned  integer 
address  integer 
boolean  vector 

boolean  (single  bit) 

integer sign (divide or multiply by two operations)
field 

single  precision  floating 

single  precision  unnormalized  floating 
double  word 

double precision integer

fraction 

mixed 

double  precision  floating  point 

double  precision  unnormalized  floating  point 
character  string 
digit  string 

¹However logical such a course may seem, it is not always done this way.
For example, the IBM 7090 (and other members of that family) has a
15-bit address data-type and a 36-bit integer data-type, with separate
operations for each.




The notational aspect is our use in ISP of a mnemonic
abbreviation scheme for data-types. We have already used sf for single
precision floating point. More generally, as Table 1 shows, an
abbreviation is made up of a letter giving the precision, a letter
giving the name, and a letter giving the length. A full treatment
can be found in the Appendix.
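
As an illustration of how such abbreviations compose, the sketch below expands a small subset of the Table 1 codes in Python. The dictionaries cover only some precision and name letters (the ambiguous letter q and the length-type suffixes are deliberately left out), and the function name `expand` is hypothetical.

```python
# Subset of the precision letters in Table 1 (q omitted: it is used twice there).
PRECISION = {"h": "half", "s": "single", "d": "double", "t": "triple"}

# Subset of the data-type names in Table 1.
NAME = {
    "i": "integer",
    "ui": "unsigned integer",
    "fr": "fraction",
    "mx": "mixed",
    "f": "floating",
    "uf": "unnormalized floating",
}

def expand(abbrev: str):
    """Split an abbreviation into (precision, data-type name).

    A bare name defaults to single precision, since Table 1 marks
    'single' as optionally omitted.
    """
    if abbrev in NAME:
        return "single", NAME[abbrev]
    head, tail = abbrev[:1], abbrev[1:]
    if head in PRECISION and tail in NAME:
        return PRECISION[head], NAME[tail]
    raise ValueError(f"unrecognized data-type abbreviation: {abbrev!r}")
```

With these tables, `expand("duf")` gives `("double", "unnormalized floating")`, matching the last example in Table 1.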

The simple naming convention does not take into account all
that is known about a data-type. The information carrier for the
data is only partially included in the length characteristic. Thus
the carrier should also include the data base and the sign
convention for representing negative numbers. The common sign
conventions are sign magnitude, true complement (i.e., two's
complement for base 2), and radix-1 complement (i.e., one's complement
for base 2).

For each of the data-types the processor must have the implied
operators. In fact, being able to represent a particular entity is
useful only if particular transformations can be carried out on the
entity. The most primitive operation is data movement (i.e.,
transmission). Data movement can be thought of as a complex operation
consisting of accessing (locating), reading, and writing. Data-types
which represent numbers require the ability to perform the
arithmetic operations +, −, ×, /, abs( ), sqrt, max, min, etc. The
address integer is a special case of an arithmetic quantity, and
often only additive arithmetic operations (+ and −) are available
for it. Boolean scalars (or vectors) require some subset of the 16
logical operations (sufficient subsets are ¬, ∧ or ¬, ∨). When
character strings are represented, the concatenation, deletion, and
transmission operations are required. Alternatively, we can look
to string processing languages like SNOBOL or COMIT to see the
operations they require. If the strings also represent numeric
quantities, then the arithmetic operations are necessary. Almost all
arithmetic and symbolic data require relational operations
between two quantities, yielding a boolean result (true or false).
These relational operators are = and ≠, but for arithmetic
quantities they also include >, ≥, <, ≤. The more complex structured
data-types (e.g., vectors and arrays) also have a range of certain
primitive operations such as scalar accessing and transmission. Typical
operations on vectors are search and element-by-element compare
operations.
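
The remark that ¬, ∧ (or ¬, ∨) is a sufficient subset of the 16 logical operations can be verified by brute force. The sketch below, in Python, represents each two-input boolean function by its 4-bit truth table and closes the two input variables under a chosen set of operators; the names `closure`, `NOT`, `AND`, and `OR` are illustrative.

```python
A, B = 0b0011, 0b0101      # truth tables of the two inputs over the 4 input pairs
ALL_ONES = 0b1111

def NOT(f):
    return ~f & ALL_ONES

def AND(f, g):
    return f & g

def OR(f, g):
    return f | g

def closure(unary_ops, binary_ops):
    """All two-input functions reachable from A and B with the given operators."""
    funcs = {A, B}
    while True:
        new = set(funcs)
        for f in funcs:
            for op in unary_ops:
                new.add(op(f))
            for g in funcs:
                for op in binary_ops:
                    new.add(op(f, g))
        if new == funcs:       # fixed point: nothing further can be generated
            return funcs
        funcs = new
```

`closure([NOT], [AND])` and `closure([NOT], [OR])` each contain all 16 truth tables, whereas `closure([], [AND, OR])` contains only four, so negation cannot be dropped.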

Relationship  between  PMS  and  ISP 

In the introduction to this chapter we discussed briefly the
relationship between PMS and ISP. With the two described, we can
now be more precise. There are really two questions here. First,
where do these two descriptive systems fit in with respect to the
general hierarchical view of computer structures discussed in
Chap. 1? Second, what is the relationship between a PMS diagram
of a processor and the ISP of that same processor? The questions
are related, but each is best answered separately.


Table 1  Abbreviations used to name data-types

Precision                Data-type-name                     Length-type

fractional      f        boolean                   b        *scalar
quarter         q        sign                               vector    v
half            h        decimal digit, digit      d        matrix
*single         s        octal digit, octal        o        array
double          d        character, char           ch, c    string    st
triple          t        byte                      by
quadruple       q        syllable
multiple        m        word                      w
+ integer (e.g., 10)     signed integer            i
                         unsigned integer          ui
                         fraction                  fr
                         fixed, mixed              mx
                         floating, real            f
                         unnormalized-floating     uf
                         complex real, complex     cx

Examples:

w       word
bv      boolean vector
i       integer
sfr     single precision fraction
mx      mixed
di      double integer
10d     10 decimal digit (scalar)
3.ch    3 character (scalar)
ch.st   character string
sf      single precision floating
suf     single precision unnormalized floating
df      double precision floating
duf     double precision unnormalized floating

*May be optionally omitted from name

With respect to the first question, the PMS system describes
the topmost system level (recall Fig. 1 of Chap. 1), above the
programming, logic, and circuit levels. It lacks a characteristic that
all these other levels share, namely, that of providing a complete
description of the computer's performance. The programming
manual (with timing) tells everything that is significant about the
performance of the computer (if it runs error-free). The same is
true of the full description at the register-transfer level, the
logic-circuit level, and on down to the electrical circuit level. But the
PMS level is only an approximate description, from which only
certain aspects of the system's performance can be calculated.




The ISP does not constitute a distinct system level. Rather, it
describes the interface between two levels, the register-transfer
level and the programming level. It is used to define the
components of the programming level (instructions, operations, and
sequences of instructions) in terms of the next lower level. In
principle, and usually in fact, the language of the lower level is
used to describe the components and modes of connections, one
level up. In many ways ISP is a register-transfer language (in
symbolic rather than graphical form; but as we noted in Chap.
1, there appear always to be two such isomorphic notations at
each system level). However, ISP has been extended by allowing
the instruction-expression to be a general linguistic expression for
a computation, just as if ISP were FORTRAN or ALGOL. This
is what permits us to talk of ISP as not necessarily determining
the exact set of physical registers and transfer paths. The
instruction-expressions describe the functions to be performed without
entirely committing to the RT structure.

If the ISP is the interface language between the RT and
programming levels, what is its relationship to PMS, which is one
level above? Every PMS component has associated with it a set
of operations and a control structure for getting those operations
executed in connection with the arrival of various external signals.
As we noted earlier in the chapter, there is an ISP description
for each operation in its context of control. That is, ISP is the
interface language for describing all PMS components in terms
of the register-transfer level, not just P. It happens that only one
of these PMS components, the processor, carries with it an entire
new systems level: the programming level. All the other
components have no analog of the programming level and interface
directly to the register-transfer level (or even in simple cases to
the logic-circuit level). Precisely because of the simplicity, we have
not bothered to develop ISP descriptions of components other
than processors.

The second question, namely, the relation between the ISP and
PMS descriptions of the same processor, arises from the ability
to represent PMS components recursively as PMS structures made
up from more elementary PMS components. Thus, Mp(32 kw, 16 b)
can be considered as compounded of 32k memories, M(1 w, 16 b),
with an addressing switch, S.random. Indeed, if one carries this
to the limit, where the M's are single bit memories (flip-flops),
the S's are one bit gates, a couple of specific K's are defined for
AND and OR, etc., then it is possible to draw a PMS diagram
isomorphic to any logic circuit. Thus, a processor (P) can be
represented as a PMS involving M's, K's, D's, S's, etc., and at varying
levels of detail. Since we also have a description of this same P
in ISP, it is appropriate to consider the correspondence.
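
The recursive decomposition of Mp can be mirrored in code. The following Python sketch models Mp(32 kw, 16 b) as 32k one-word memories behind an addressing switch; the class and method names (`WordMemory`, `RandomSwitch`, `PrimaryMemory`, `select`) are hypothetical and serve only to show the composition.

```python
class WordMemory:
    """M(1 w, 16 b): a single 16-bit word."""

    def __init__(self):
        self.value = 0

    def read(self) -> int:
        return self.value

    def write(self, value: int) -> None:
        self.value = value & 0xFFFF        # keep only 16 bits


class RandomSwitch:
    """S.random: connects the port to the one word memory named by an address."""

    def __init__(self, memories):
        self.memories = memories

    def select(self, address: int) -> WordMemory:
        return self.memories[address]


class PrimaryMemory:
    """Mp(32 kw, 16 b) compounded of 32k M(1 w, 16 b) and one S.random."""

    def __init__(self, words: int = 32 * 1024):
        self.switch = RandomSwitch([WordMemory() for _ in range(words)])

    def read(self, address: int) -> int:
        return self.switch.select(address).read()

    def write(self, address: int, value: int) -> None:
        self.switch.select(address).write(value)
```

Seen from outside, `PrimaryMemory` behaves as one memory; the switch and the word memories are visible only at the more detailed level of description.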


First of all, every memory in the ISP description corresponds
to a memory in the PMS description. The data operations in ISP
imply corresponding D's in PMS and every occurrence of transmit
(←) implies a corresponding link between the M's and D's on the
right hand side and the M on the left, being written into. That
the instructions of the ISP are evoked only under certain
conditions implies that a control (K.operation-decode) exist in the PMS
structure. Similarly, the simple, two-state stored-program model
(instruction-fetch, instruction-execute) for the interpreter implies
an interpreter control (K.interpreter). The action-sequence of each
instruction, if it contains any semi-colons or next's, requires
additional K and possibly additional M (if the structure involves
embedded operations such as (A + B) × (C + D)). Thus for every
ISP component there is an implied component in the PMS
structure of the processor.

The PMS diagram model for a computer shown initially on page
17 has the "natural units" implied by the ISP description (with
the exception of the instruction format part) as suggested on page
24. The data-operations D are therefore implied each time an
operation is written. Each process implies a control which we
lump into the single K of the figure. The model also shows both
the arrival of instructions and the flow of data between the
processor (P) and memory (Mp).

There are several memories within Pc which are not explicitly
shown on page 17. These include temporary memory within D
and the K for carrying out complex arithmetic operations. The
interpreter control has temporary memory, of course. Finally,
other kinds of memories have been omitted to simplify the model.
In multiprogrammed computers a mapping control and memory
would be used, and in pipeline or highly parallel processors there
would be temporary memory for various buffering (e.g.,
instructions and data). The Appendix lists the various memories of the
processor.

K(P), the control for the processor above, controls data
movement among the Mp and M.processor state and evokes the
data-operations of D. Functionally, K(P) can be broken into several
parts, each of which is responsible for a part of the overall
instruction interpretation and execution process, and each of which
corresponds to a part of the ISP description. This decomposition is
allowed in PMS, and if we did so, each component would contain an
independent control for its own domain, e.g., a K(D), K(Mp),
K(Instruction-set interpreter). More elaborate processor structures
imply having controls for functions like multiprogram mapping.
The K(Instruction-set interpreter) is the supervisory component
which causes other processor K's to be utilized in a complex
processor. In an ISP description of a C, the interpreter usually
selects only the next instruction and then, after decoding (or
examining) it, proceeds to have the instruction executed by
K(instruction execution).

[Fig. 5 diagram not reproduced. Recoverable labels: a state diagram with
states for instruction address calculation, instruction fetch, data
address calculation, and instruction execution (operation on processor
state); t = time spent in a state; a resource allocation diagram with
rows Mp #0, Mp #1, and Pc, marked with t.access and t.cycle. The
instruction being interpreted is add → (A ← A + M[v]).]

Resource Allocation. At the PMS level the concept of resources,
their uses and allocation, becomes a major focus of analysis. This
is obvious by now in multiprogramming and multiprocessing
systems where many programs share the same Mp and hence must
be allocated space. But this holds equally well at all levels of
detail.

By giving a resource allocation diagram along with the state
diagram (Fig. 5) we show the relationship of resources, their
function, and time for the instruction-interpretation process. In Fig.
5 the add instruction for a simple 1 accumulator computer
consisting of 1Pc-2Mp is given. The interpretation for Fig. 5 in ISP
is as follows:

1  Calculates the address of instruction q in state soq.
   t1 − t0 = toq.

   PC ← PC + 1; next        advance the program counter

2  The instruction is fetched (accessed) from Mp in state saq.
   t2 − t1 = taq.

   M.instruction ← Mp[PC]; next

3  The operation o to be performed and the address part, v,
   for the data in M.instruction to be added to A are obtained
   in state soo + sov.r. t3 − t2 = too + tov.r.

   M.address ← M.instruction<v>; next

4  The data Mp[v] are fetched in state sav.r. t4 − t3 = tav.r.

   M.temporary ← Mp[M.address]; next

5  The operation part o of the instruction is carried out on
   A; that is, the actual addition is performed on the data
   previously accessed in the state so. t5 − t4 = to.

   A ← M.temporary + A; next
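
The five steps above can be traced in a short Python sketch. Several things here are assumptions made only for illustration: Mp is a dictionary, an instruction is encoded as an (operation, address) tuple, the state times are arbitrary integers, and the function name `interpret_add` is hypothetical. The order of the transfers follows the list literally, so the instruction is fetched from the location named by PC after it has been advanced.

```python
def interpret_add(mp, pc, a, times):
    """Follow steps 1 to 5 for one add instruction.

    Returns the new PC, the new accumulator A, and the elapsed time t5 - t0.
    """
    elapsed = 0

    pc = pc + 1                            # step 1 (state soq)
    elapsed += times["toq"]

    m_instruction = mp[pc]                 # step 2 (state saq)
    elapsed += times["taq"]

    operation, v = m_instruction           # step 3 (states soo and sov.r)
    m_address = v
    elapsed += times["too"] + times["tov.r"]

    m_temporary = mp[m_address]            # step 4 (state sav.r)
    elapsed += times["tav.r"]

    if operation != "add":                 # only the add instruction is modeled
        raise ValueError(f"unsupported operation: {operation!r}")
    a = m_temporary + a                    # step 5 (state so)
    elapsed += times["to"]

    return pc, a, elapsed
```

The elapsed time is simply the sum of the state times along the path taken, which is the quantity the state diagram is meant to expose.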

In the state diagram, each state represents the time spent for
a given activity. The two states at the top of the state diagram
(Fig. 5) are waiting for primary memory accesses, and the three
lower states represent processor activity waits. If we were to
specialize the state diagram for the conventional 1 address/
instruction computer, we would need one additional state,
representing operand storage, sav.w, and this would occur after state
so. Note that we have ignored the operation decoding state, soo.
Of course, conditional state transformation paths have to be added
to describe all instructions (e.g., a complement-the-accumulator
instruction has only states soq, saq, and so). Similarly, we could
make a more general state diagram to handle the different
processors (e.g., multiple addresses/instruction, stack, and general
registers), as shown in Fig. 4. At the PMS level, a derivative of the
state diagram, the resource allocation diagram, is more useful
because it relates to the physical structure.

Fig. 5. State and resource allocation diagram for a 1Pc-2Mp add
instruction-interpretation process.

A resource allocation diagram expresses the above instruction
activity in terms of the time each unit is occupied with a particular
activity. In this diagram a slightly more complex computer
structure with two primary memories has been assumed. In the case
of the add instruction, the long memory-cycle time suggests that
two memories can be used so that an operand can be fetched while
the instruction memory restoration occurs. These diagrams show
the time various resources are utilized; thus performance and
utilization can be measured.
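
A rough timing model shows why the second memory helps. The Python sketch below assumes that a memory delivers data t_access after a request starts but stays busy for a full t_cycle (access plus restoration), and that decoding plus data-address calculation takes t_decode; all parameter names and the sample figures are hypothetical and are not taken from Fig. 5.

```python
def add_instruction_time(t_oq, t_access, t_cycle, t_decode, t_o, two_memories):
    """Elapsed time for one add instruction under a simple busy-until model."""
    fetch_start = t_oq                              # after instruction address calculation
    instruction_ready = fetch_start + t_access      # instruction word available to Pc
    instruction_memory_free = fetch_start + t_cycle # restoration finished

    operand_request = instruction_ready + t_decode  # Pc is ready to ask for the operand
    if two_memories:
        operand_start = operand_request             # operand lives in the other Mp
    else:
        # A single Mp cannot start a new access until its cycle has completed.
        operand_start = max(operand_request, instruction_memory_free)

    return operand_start + t_access + t_o           # operand fetched, then executed
```

With the illustrative values t_oq = 100, t_access = 500, t_cycle = 1000, t_decode = 100, and t_o = 200, the single-memory case takes 1800 time units and the two-memory case 1400, because the operand access overlaps the restoration of the instruction memory.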

Resource allocation diagrams can express other time scales.
Interest in operating-system software analysis is often in the
activities on a longer time scale of the resources utilization as a
function of various programs and subprograms. They may show
Mp memory occupancy in a multiprogrammed environment. Some
other time scales of particular interest are the instruction(s), short
instruction sequences or subprograms, and the program times. The
first two time scales are influenced predominantly by the hardware,
and the latter time scale is influenced by software and the
external environment.

The resource allocation diagrams also can describe the
utilization of the C's resources over time (e.g., throughout the
instruction-interpretation process) and provide a basis for more detailed
analysis and design.

The design problem at the PMS-ISP interface is mainly one
of resource scheduling.

1  A fixed set of operations has to be performed on the jobs
   (here, a job is an instruction).

2  Each instruction may create a few other small but definitive
   subjobs.

3  There can be a fixed set of operators which handle various
   parts of the operations.

4  Jobs (or instructions) enter P sequentially.

We may ask:

1  How many operators of each type do we have?

2  What is the scheduling policy for assigning instructions to
   the operators?

3  How many instructions can be in P at one time, and in what
   order must the processing be performed? How are the jobs
   interlocked?

We do not attempt to answer the above questions but intend
only to show the relationship of the various parts which define
the problem. ISP implies a certain structure (conversely, PMS
behavior is specified in terms of the ISP language). A particular
ISP structure and a program denote a certain path through a state
space as specified by a state diagram. Finally, the physical
resources (in PMS) are constrained to operate according to the state
diagram as expressed by using a resources allocation diagram. The
resource allocation diagram can then be used to evaluate the
structure's performance (in PMS) at a higher level (e.g., the number
of instructions/second it executes).




Summary 

The ISP descriptions of computers are usually given as an appendix
to a chapter. We organize the description into the following units:

Memory Declaration           P State
                             Mp State
                             P Console State

Formats and Operators        Instruction Format
                             Data-type Formats and Special Data
                             Operation Definitions
                             Effective-address Calculation Process

Interpreter and the          Instruction Interpretation Process
Instruction-set Execution    Instruction-set and Instruction Execution Process

The  above  description  format  conveys  a  rather  narrow-minded 
view  of  the  ISP  structure  of  computer  systems.  However,  almost 
all  present  computers  fit  easily  into  such  a  format.  We  do  not 
presume  to  say  whether  it  will  suffice  for  future  ISPs. 

With  the  introduction  given  here  and  with  the  definitions  and 
example  in  the  Appendix  at  the  end  of  the  book,  it  should  be 
possible  to  understand  all  the  PMS  diagrams  and  ISP  descriptions 
used  throughout  the  book. 


Chapter  3 

The  computer  space 

Introduction 

The preceding two chapters have provided a view of a computer
system as an organized hierarchy of many levels: physical devices,
electronic circuits, logic circuits, register-transfer systems,
programs, and PMS systems. We must remember that these are levels
of description for what, after all, remains the same physical system.
Each higher level describes more of the total system, but with
a loss of detail. As this is an engineered system, great care is taken
that each level represent adequately all the behavior necessary
to determine the performance of the system. In natural systems
too there are often many levels of description (e.g., in biological
systems, from the molecule to the organelle to the cell to the
tissue to the organ to the organism).

However, in natural systems we usually depend on statistics
to eliminate the details of lower levels and permit aggregation,
and they always do so imperfectly. In computer systems, on the
other hand, the aggregation is intended to be perfect. It fails, of
course, and so both error detection and error correction exist as
fundamental activities in computer systems. But these
imperfections are ascribed to the system itself and not to our description
of it, which is just the opposite from how we treat natural systems.
Only the PMS level of description is natural, in the sense of not
being the intended result of the design. This is because
performance is defined ultimately at the programming level. The
aggregations and simplifications that go into a PMS description (e.g.,
measuring power by bits per second) are approximations, just as
they are for any natural system (e.g., measuring the productivity
of the economy by gross national product).

We have provided descriptive systems for the top levels of the
hierarchy: the PMS level and the ISP level, the latter defining the
basic components of the programming level in terms of the RT
level just below. These are the two descriptions that are of most
concern in the overall design of a computer system. We did not
define the lower levels, because they go beyond the focus of this
book. Neither did we define the program level, partly because
there exists no uniform description (no common programming
language) and partly because the computer designer works mostly
at the interface, defining the instruction set. This latter is what
the ISP provides.¹

'An  increasingly  popular  view  is  that  the  program  and  RT  levels  (with 
ISP  in  between)  are  one,  thus  erasing  the  difference  between  hardware 


PMS  and  ISP  permit  the  description  of  an  indefinite  number 
of  computer  systems — indeed,  all  that  come  within  the  scope  of 
the  current  design  art.  (Thev  might  even  be  taken  as  a  definition 
of  what  that  current  art  is.)  Some  lO"*  ~  10'  individual  computer 
.systems  have  in  fact  come  into  existence,  each  of  which  can  be 
described  in  PMS  and  ISP.  They  are  not  all  radically  individual. 
There  are  about  1(P  types  of  computer  systems  represented,  if 
we  define  two  systems  with  the  same  Pc  to  be  of  the  same  type. 
I  By  exercising  various  options,  a  single  computer  type  could  take 
on  10'  different  forms.) 

Of these thousand-odd types, we present in this book just 40.²
What sort of total population do we have here? What does our
minuscule sample look like when compared with the whole? More
fundamentally, what are the significant aspects of the computer
systems that should be used in a comparison or classification? These
are the questions we will try to deal with in this chapter. We can
be neither comprehensive nor elegant. The study on which to base
an adequate taxonomy of computer systems has simply not yet been
done. But we can present a rough picture
based  on  the  common  lore  of  the  field,  filled  in  with  our  own 
predilections. 

For any system, either an entire computer, C, or a component,
such as P, M, or S, it is convenient to distinguish its function, its
performance, and its structure. The system is designed to operate
in some task environment; to accomplish such tasks is its function.
How well it does these tasks is its performance. Evaluation of
performance is normally restricted to these tasks. Although it is
always noteworthy when a system can perform adequately outside
its specified domain (e.g., when a business computer is also a good
control computer), it is rarely worth noting when a system cannot
perform those tasks it was not built to perform. Thus, function
denotes  scope,  and  performance  denotes  an  evaluation  within  that 
scope. 

Structure  denotes  those  aspects  of  the  system  that  allow  it  to 
perform.  This  includes  descriptions  of  its  subcomponents  and  how 
they  are  organized.  Performance  of  subcomponents  often  may  be 
considered  structure  as  far  as  the  whole  system  is  concerned, 
especially  if  the  performance  can  be  taken  as  given.  For  example, 
early  digital  transmission-oriented  telephone  lines  came  in  two 
capacities, ~200 bits/sec and ~2,000 bits/sec. From the viewpoint
of the telephone system, these are performance measures;

and  software.  The  boundary  appears  to  us  not  quite  so  invisible.  We  take 
the  important  task  to  be  drawing  the  boundary  in  the  right  place  for  any 
specific  design. 

²Counting each of the families in Part 6 as one computer. The IBM
System/360 is actually a series.


Part  1     The  structure  of  computers 


from  the  viewpoint  of  a  computer  system  with  remote  terminals, 
these  are  structural  parameters. 

Typically, design proceeds in a context in which the function
of the to-be-developed system is taken as given and certain
structures are available; the problem is to construct a structure that
achieves adequate performance.

These terms apply to any designed system. For example, consider
automotive vehicles. Function is a classification by use: cars
to carry people, trucks to carry goods, racers to win competitions,
antiques to satisfy nostalgia and collectors' pride. Performance is
those aspects of behavior relevant to function: maximum speed,
power-to-weight ratio, cargo capacity, run versus not run for an
antique, and so on. Structure is such things as number of wheels,
shape of the vehicle, stroke volume, and gear ratios. Structure
determines performance, although from the standpoint of design,
of course, causality runs the other way: from function to
performance to structure.

There  are,  then,  three  main  ways  to  classify  or  describe  a 
computer  system:  according  to  its  function,  its  performance,  or 
its  structure.  Each  consists  in  turn  of  a  number  of  dimensions.  It 
is  useful  to  think  of  all  these  dimensions  as  making  up  a  large  space 
in  which  any  computer  system  can  be  located  as  a  point.  In  such 
a  space  all  the  thousand  computer  types  built  to  date  constitute 
a  sparse  scatter,  clustering  (it  is  to  be  hoped)  in  various  regions 
that  make  sense  functionally  and  economically.  The  40  computer 
types  in  this  book  sample  this  larger  scatter  in  some  way,  to  give 
a  picture  both  of  the  entire  space  and  of  the  part  already  explored. 
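
The idea of locating a computer as a point in a space of dimensions can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the four dimensions are a small subset chosen for brevity, the machines A through D are hypothetical and invented for the example, and the names ComputerPoint, cluster_sizes, and same_region are ours, not part of PMS or ISP.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputerPoint:
    """One computer type, located along a few computer-space dimensions."""
    name: str
    function: str                   # e.g. "scientific", "business", "control"
    logic_technology: str           # e.g. "vacuum tube", "transistor", "integrated"
    word_size_bits: int
    addresses_per_instruction: str  # e.g. "1 address", "1 + x address"


# Hypothetical machines, invented for illustration only.
SAMPLE = [
    ComputerPoint("A", "scientific", "vacuum tube", 36, "1 address"),
    ComputerPoint("B", "scientific", "transistor", 36, "1 + x address"),
    ComputerPoint("C", "business", "transistor", 48, "2 address"),
    ComputerPoint("D", "control", "integrated", 12, "1 address"),
]


def cluster_sizes(points, dimension):
    """Count how many points share each value along one dimension."""
    return Counter(getattr(p, dimension) for p in points)


def same_region(p, q, dimensions):
    """True when two points agree on every one of the listed dimensions."""
    return all(getattr(p, d) == getattr(q, d) for d in dimensions)


if __name__ == "__main__":
    print(cluster_sizes(SAMPLE, "logic_technology"))
    print(same_region(SAMPLE[0], SAMPLE[1], ["function", "word_size_bits"]))
```

Counting along one dimension shows how unevenly a sample covers it; comparing two points on a chosen subset of dimensions is one simple way to ask whether they fall in the same region of the space.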

How many dimensions are there in this computer space?
Indefinitely many, if one wants to locate a computer with ultimate
precision. In fact, if one wants to go all the way, one might as
well give the PMS and ISP descriptions (and down through the
RT, logic, circuit, and device levels). The virtue of thinking of
such a space is to abstract to a small number of dimensions, and
to select those that are most relevant. Of the functions, one wants
those that most influence the design; of the performance, one
wants those that make the largest difference; of structure those
that not only affect performance but represent possible design
choices by the computer engineer. In addition, one wants
dimensions along which there is significant variation. Those aspects of
computer  systems  which  are  common  to  all,  such  as  the  use  of 
binary  devices,  though  of  supreme  interest  are  not  part  of  the 
computer  space. 

What are the dimensions of the computer space? As we remarked
earlier, there is no sufficiently comprehensive theory of
computer  systems  to  tell  us.  Considerable  lore  has  grown  up  from 
experience  to  date  in  designing  machines.  But  at  some  point  one 
must  simply  propose  a  set  of  dimensions  and  let  them  justify 


themselves after the fact. Table 1 gives our set for function and
structure. Table 3 (page 52) gives our set for performance.
Table 1 gives only a single dimension for computer system function
and 19 for computer structure; Table 3 gives 8 for performance.
However, the dimensions are not all independent. Many
of the structure dimensions are highly (though not perfectly)
correlated. Thus, in Table 1 we have put the structure dimensions
in seven horizontal groups, with the one at the left-hand
side being the most relevant. (In the first structure group, we
have also added two temporal dimensions, since a strong correlation
with time exists.) For performance, the dimensions form a
tree structure, where the higher dimensions are essentially aggregate
summaries of the lower ones. Finally, there is a general
correlation between overall performance and the various structure
dimensions in Table 1, with increasing performance as one moves
down the dimensions. We have left off two important dimensions
because we do not have values; these are reliability (mean time
between failures per operation) and physical size density (e.g.,
bits/ft³), both of which increase with generation.

With each dimension we have indicated the range of possible
values. For some (Pc.speed, for example) this is a numerical
quantity. However, for most, the range is a discrete set of design
choices, which may or may not have a simple ordering. Clearly,
these discrete values are selections from a meaningful subspace
of design choices, but mostly we do not know how to construct
that  subspace.  The  values  given  are  those  that  have  arisen  in 
practice,  and  they  serve  to  classify  the  computers  in  the  book. 
Obtaining  a  more  rational  subspace  is  a  task  for  future  research. 

The  body  of  the  chapter  will  be  taken  up  with  a  discussion 
of  each  of  these  dimensions,  where  we  will  discuss  further  their 
definition,  the  basis  for  their  selection,  and  the  reasons  behind  the 
arrangements  of  Tables  1  and  3.  We  give  the  entire  set  of 
dimensions  here  at  the  beginning,  both  for  later  reference  and  to 
emphasize the view of a single computer space in which computer
systems can be located. We will refer to Tables 1 and 3
from  now  on  simply  as  the  computer  space  or,  more  narrowly, 
as  the  computer  structure  space,  the  computer  performance 
space,  etc. 

History 

Like  all  systems  subject  to  variation  and  selection,  computers  have 
evolved  through  time.  So  striking  and  rapid  has  been  this  evolution 
that the concept of "generation" has become firmly embedded in
the  computer  engineering  culture  (to  say  nothing  of  the  marketing 
culture  and  the  view  of  the  lay  public).  It  is  at  best  an  ambiguous 
term,  having  none  of  the  sharpness  of  its  root  term  in  biological 
evolution,  where  it  is  possible  to  draw  a  strict  genealogical  tree. 


Chapter 3 | The computer space


Nevertheless,  the  term  is  useful  in  stressing  that  the  history  of 
computer  systems  is  not  just  a  story  of  particular  men  discovering 
or  building  particular  things,  but  of  a  somewhat  more  impersonal 
and  widespread  series  of  advances  that  have  changed  computer 
systems  radically. 

The generations are best defined solely in terms of logic
technology: The first generation is that of vacuum tubes (1945 ~ 1958),
the second generation is that of transistors (1958 ~ 1966), and the
third generation is that of integrated circuits (1966 ~ ). In fact,
current usage describes hybrid logic technology machines, such
as the IBM System/360, as third generation, and so this extension
must be included. What will be called fourth generation is yet
to emerge; most likely it will be medium and large scale integrated
circuits with possibly integrated circuit primary memory.


It is a measure of American industry's generally ahistorical view
of things that the title of "first" generation has been allowed to
be attached to a collection of machines which were some generations
removed from the beginnings by any reasonable accounting.
Mechanical and electromechanical computers existed prior to
electronic ones. Furthermore, they were the functional equivalents
of electronic computers and were realized to be such. They were
also separated by a wide gap in performance and structure, both
from each other and from vacuum tube machines. Thus, by reasonable
reckoning, we are currently in the fifth generation of computers,
not the third. But usage is now too well established to
change.

Actually, it was not always viewed thus. Figure 1 reproduces
a genealogical tree of the early computers prepared by the Na-


[Fig. 1 graphic not reproduced. Legible labels: Present generation; Harvard Mark I; Predecessors.]

Fig.  1.  The  "family  tree"  of  computer  design.  The  remarkable  growth  of  electronic  computing  systems  in  the  Western  world  began  primarily  through 
government  support  of  research  and  development  in  the  universities.  The  need  for  data-processing  facilities  of  increased  capacity  inspired  further 
support  for  their  development  in  both  educational  institutions  and  private  industry.  The  current  generation  of  computers  is  predominantly  the 
result  of  development  by  private  industry.  The  tree  lists  many  of  the  machines  developed  in  these  ways.  At  the  roots  are  the  contributions  of  many 
existing  technologies  to  the  rapid  growth  from  electromechanical  to  electronic  systems.  Some  of  the  milestones  are  ENIAC  (Electronic  Numerical 
Integrator  and  Computer),  the  first  electronic  computer;  EDVAC  (Electronic  Discrete  Variable  Automatic  Computer),  the  first  internally  stored- 
program  computer  and  first  acoustic  delay-line  storage;  MADM  (Manchester  Automatic  Digital  Machine),  the  first  index  registers  (B  lines)  and  first 
cathode-ray-tube  electrostatic  storage;  MTC  (Memory  Test  Computer),  the  first  core-storage  computer.  (Courtesy  of  National  Science  Foundation.) 




Table 1  The computer-space dimensions

Computer function

  Scientific
  Business
  Control
  Communications (switching | store and forward)
  File control
  Terminal
  Time sharing

Logic technology                     Historical   Date     Pc.speed   Cost/operation
                                     generation            (…sec)     ($/bit/s)

Mechanical
Electromechanical                                 1930     10^?       1000
(Fluidics)                                        (1970)   10^?
Vacuum tube                          first        1945     10^?
Transistor                           second       1958     10^?       ?
Hybrid                                            1964     10^?
Integrated/IC                        third        1966     10^?       0.1
Medium to large-scale integrated/    fourth?      197?                0.01
  MSI ~ LSI

Word size         Base       Data-types

8 b               binary     word
12 b              decimal    integer address (integer)
16 b                         bit | bit vector
24 b                         instruction
32 b                         floating point
48 b                         character
64 b                         character string
character (6b)               word vector
character (8b)               vector
                             matrix
                             array
                             lists, stacks

Addresses/instruction               M.processor state (excluding program counter)

0 address (stack)                   stack
1 address                           1 accumulator
1 + x (index) address               accumulator and index registers
1 + g (general register) address    general registers array
2 address
3 address                           no explicit state
n + 1 address
Language determined
Compound
Microprogrammed




PMS structure

  1 Pc
  1 Pc (interrupt)
  1 Pc-nPio
  1 Pc-nPio-P(display)
  2 C (duplex)
  nPc (multiprocessing)
  nPc-P(array special algorithm)
  nPc (parallel processing)
  C (network)
  Network

Switching

  1:n (duplex)
  n:m (time-multiplex)
  2:n (dual-duplex)
  n:m (cross-point)
  n/2:n/2 (non-hierarchy)

Processor function

  P.microprogram
  Pc
  Pc (no Io)
  Pio
  P.display
  P.array
  P.vector move
  P.algorithm
  P.language

Accessing algorithm

  Linear (stack)
  Linear (queue)
  Bilinear
  Cyclic-random
  Cyclic
  Random
  Content
  Associative

Mp.size, Ms.size

  tape (large)
  disk (medium) | magnetic card (large)
  drum (large)     drum (small) | photostore (large)
  core (medium)    core (smaller)
  film (small)
  integrated circuit

Mp.speed (b/s), Ms.speed (b/s)

  > 10^?

Mp concurrency

  1 program
  1 program with interrupts
  1 program with multiple concurrent subprograms (for example, 1 Pc-nPio)
  Monitor or fixed program (M) + 1 program
  m + n swapped programs
  m + n programs (multiprogramming)

  No relocation
  1 segment
  2 segments (pure, impure)
  >2 segments
  Pages
  m + n segments with shared programs
  Fixed length, paged segments
  Multiple-length paged segments
  Variable-length segments
  Named segments

Interprocess communication

  subroutines and traps
  interrupts
  interprocessor interrupts
  extracodes (programmed operators for monitor calls)
  intersegment communication

Processor concurrency

  Serial by bit
  Parallel by word
  Multiple instruction streams, 1 Pc
  Multiple data streams (arrays)
  1 instruction buffer
  n instruction buffer
  Look-aside memories
  Pipeline processing




tional Science Foundation in 1959. Notice that the Harvard Mark
machines, which were constructed from relays (hence
electromechanical), are accorded the place of honor as first generation
(but Babbage is nowhere to be seen).

It  is  not  appropriate  to  provide  here  an  adequate  history  of 
computer  technology.  The  early  story  has  often  been  told,  starting 
with  Babbage  and  early  mechanical  calculators,  through  Hollerith 
punched  cards,  on  to  the  relay  calculators  at  Bell  Laboratories 
and  Harvard,  up  to  the  birth  of  electronic  machines  with  ENIAC, 
and  finally  to  the  stored-program  concept  with  the  von  Neumann 
machine at the Institute for Advanced Study (IAS), EDSAC at
Cambridge University, and EDVAC at the University of Pennsylvania
(with the contemporary developments by Zuse in Germany
often  left  out).  And  there  have  been  a  few  scattered  attempts  to 
tell  some  of  the  story  of  the  last  three  generations.  But  to  date 
no  really  satisfactory  historical  account  has  been  given.  This  is 
due  in  part  to  recency  and  in  part  to  the  difficulties  of  evaluating 
and  sorting  out  the  significant  developments  of  a  very  complex 
technology  undergoing  rapid  growth. 

What  is  appropriate  here  is  to  view  the  evolution  of  computer 
systems  as  measured  by  the  dimensions  of  computer  space  and 
to  localize  the  examples  of  this  book  in  relation  to  calendar  time 
and  other  computers.  The  concept  of  generation  has  led  others 
to attempt the same thing by constructing a family tree, Fig. 1
being but one example. But the relationships between computers
are not nearly as simple as such a tree implies. We prefer to plot
a straightforward time chart,¹ as shown in Fig. 2, in which we group
the machines by manufacturer and, within each group, by
acknowledged family relationship (for example, 701-704-709-etc.).
There  is  clearly  relatively  closer  kinship  within  a  company  than 

¹Whereas we have checked the Time Chart numerous times for accuracy,
we make no claim about the number of errors it still has. We have relied
on the following source data: (1) Original papers. These are mostly shown
on the chart as "p". Normally the reader can infer that the work
presented in a paper occurs prior to the actual publication. There are notable
exceptions (e.g., the core memory, and Atlas papers) which were first
published to lay claims to certain ideas. (2) Historical reviews. Primary
historical papers include: Rosen [1969] and Serrell [1962]. Secondary
historical review papers include: Bowden [1953], Campbell [1952], Chase
[1952], Nisenoff [1966], and Samuel [1957]. (3) Encyclopedia. (4) Computer
surveys. Two sources have been used: The Adams Associates Computer
Characteristics Quarterly, published since 1960 [Adams, 1960; Adams
Assoc., 1966, 1967, and 1968]; and Martin H. Weik's four Surveys of
Domestic Electronic Digital Computer Systems [Weik, 1955; Weik, 1961
(third); and Weik, 1964 (fourth)]. The Adams' Charts give the date of
first delivery, and the Weik Survey gives the date the computer was first
operating. (5) Manufacturer, organization or person supplied dates. In a
few cases we have asked directly for specific operational and delivery
information.


between  companies.  One  advantage  of  such  a  time  chart  is  its 
depiction  of  the  life  history  of  a  single  system,  showing  how  long 
it  takes  for  computer  systems  to  go  from  paper  through  prototype 
to  production. 

Not  all  computer  types  are  shown  on  the  chart,  there  being 
about  250  out  of  the  estimated  1,000  types.  Lack  of  space  (and 
of perseverance) accounts for the omissions. The major United
States manufacturers, as well as some minor ones, and all machines
of substantial historical interest are represented. All the
machines discussed in this book are gathered together on a separate
line (though they also occur elsewhere, if appropriate).
Foreign machines are omitted, unless they are described in this
book. In addition, the machines of many early minor manufacturers
are missing (ALWAC, ELECOM, etc.).

The  second  part  of  the  time  chart  arranges  many  computers 
by word size, to give the reader our classification. Unfortunately,
only  a  few  samples  are  given,  owing  to  space  limitations.  Thus, 
the  density  on  the  graph  does  not  indicate  the  true  density  of 
existing  machines.  Many  small  computers,  which  are  dedicated  to 
a  particular  task,  are  beginning  to  be  built  and  a  comparatively 
small  number  of  very  large  computers  have  been  built.  On  the 
bottom line we place the machines in this book.

The  third  part  of  the  time  chart  deals  with  technology  by 
listing  events  along  various  dimensions  that  have  been  significant 
in  the  evolution  of  computers.  Besides  the  dimensions  in  the 
computer  space  we  have  also  added  some  dimensions  describing 
software  systems.  Although  we  have  not  been  able  to  deal  with 
the  programming  level  in  this  book  (except  for  the  ISP  interface), 
its  development  is  clearly  as  important  as  that  of  the  hardware, 
and  there  exists  strong  mutual  interaction  between  the  two. 

The  fourth  (and  final)  part  of  the  time  chart  gives  selected 
technological events leading up to the development of the computer.
It includes the early work of Babbage, desk calculators,
and  the  Bell  Labs  and  Harvard  calculators. 

Many  stories  can  be  read  from  the  chart.  For  example,  note 
that  the  early  Bell  Telephone  Laboratories  relay  calculator  was 
used  remotely  at  Dartmouth  in  1940,  about  20  years  prior  to 
remote use of time-shared computers. Note also that successful
manufacturers tend to have a small number of computer families,
but add members as the technology dictates. (We omit the exodus
of computer companies.) We hope the reader gets as much enjoyment
from browsing the chart as we have (even after we put
it  together!). 

The  computer  space  in  Table  1  and  the  time  chart  in  Fig.  2 
provide  an  overall  framework.  We  are  now  ready  to  consider  each 
of the dimensions individually, starting with those of system function,
then the performance, and finally structure.


[Fig. 2a graphic not reproduced. The chart places computers along a calendar-time axis ending in 1970, grouped by originator. Legible originators include SDS, DEC, CDC, GE, RCA, IBM, Engineering Research Associates, Eckert-Mauchly, NPL, NBS, Lincoln Laboratory, Rand Corporation, and the University of Pennsylvania. Legible legend entries: announcement; delivered first; operational; p, paper; project started; sc, scheduled; w, withdrawn; k, reasonably compatible series; ku, upward compatible; non-stored-program calculator.]


Fig.  2a.  Time  chart:  computers  by  originator. 


[Fig. 2b graphic not reproduced. Legible band labels: character string business; decimal word business; small early scientific; machines described in Bell-Newell.]


Fig.  2b.  Time  chart:  computers  by  word  size. 


[Fig. 2c graphic not reproduced. The chart runs along a calendar-time axis from 1940 to 1970. Legible row labels: operating systems; discrete simulation languages; list processing/string manipulation; algebraic manipulation languages; hardware lines; PMS structure; secondary memory; memory technology; primary memory (size, width; time).]


Fig.  2c.  Time  chart:  technology. 




[Fig. 2d graphic not reproduced. Legible entries: Pascal calculator; Schickhardt calculator; Leibniz calculator; Thomas Arithmometer; Baldwin calculator; Bell Telephone Labs relay calculators; Harvard Mark machines; ENIAC. Legible technology labels: telegraph; electromagnet; telephone; mechanical memory; electromechanical; vacuum tubes; drums.]


Fig.  2d.  Time  chart:  pre-computer  technology. 


Function 

The  most  striking  fact  about  function  is  the  existence  of  only  a 
single  dimension,  and  with  only  a  few  values.  Perhaps  we  have 
taken  a  simplistic  view  of  the  functions  that  computers  perform, 
but  we  think  our  computer  space  represents  reality:  To  wit,  there 
is remarkably little shaping of computer structure to fit the function
to be performed.

At  the  root  of  this  lies  the  general-purpose  nature  of  computers, 
in  which  all  the  functional  specialization  occurs  at  the  time  of 
programming  and  not  at  the  time  of  design.  However,  it  might 
seem that specialized environments would not require all the
generality, so that functional adaptation would still be possible. But
this appears not to be so for two reasons. First, the level of
operations of the Pc (as defined in the ISP) is too basic to reflect the
kind of specialization offered by the environment (think of
information-transfer or conditional-transfer operations). Second, all
environments ultimately require a variety of tasks in addition to
the main specialized task. These include at least language
compilation or assembly, readable formatted output, debugging aids,
and  other  utility  routines.  By  the  time  these  have  been  added,  a 
substantial  requirement  for  generality  has  been  generated. 

However, this is not the whole story. A second part is the difference
between the computer type and the specific configuration


assembled  for  a  task.  The  latter  is  often  carefully  specialized  to 
the  function  to  be  performed.  But  this  is  mostly  the  amount  of 
Mp, the amount and types of Ms, and the number and types of T's.
Within limits, these are all items that can be attached to any type
of computer (i.e., to any Pc) and are handled in an
environment-independent way. Thus there is little specialization of computer
types, but great specialization of particular configurations. That
this should be the case indicates something about the nature of
the functional specialization: that it can be expressed adequately
in  gross  PMS  terms,  as  more  bits  of  storage  and  more  data  rate. 

There is still more to the story. Some functional specialization
exists, as indicated in the dimension. This depends primarily on
two kinds of things beyond the reach of the configurational
adaptation described above. The first consists of demands for reliability,
ruggedness,  small  size,  etc.  These  have  strong  effects  on  design, 
but  below  the  ISP  and  PMS  levels.  The  second  consists  of  demands 
for  large  amounts  of  processing  power.  One  response  to  this  again 
affects  design  at  the  lower  levels  of  logic,  devices,  and  circuitry 
and  has  little  impact  on  design  at  the  ISP  and  PMS  level.  But 
response  is  also  possible  in  terms  of  the  data-types  that  are  built 
into  the  ISP.  Large  machines  have  data-types  that  are  appropriate 
to  their  tasks  (with  operations  to  match),  and  these  affect  the 




design. In fact, this effect is the substance of the functional
specialization shown in the computer-space dimension.

Finally,  there  is  one  last  part  of  the  story,  and  it  is  the  most 
interesting  of  all.  Various  groups  of  computer  engineers  have  felt 
strongly  from  time  to  time  that  functional  specialization  should 
exist,  and  they  have  set  out  to  create  such  machines.  These  efforts 
have  often  produced  machines  that  were  different  from  the  exist- 
ing main  line  of  computers,  i.e.,  were  appropriately  specialized. 
B»t  the  net  effect  of  almost  all  such  attempts  has  been  that  the 
new  idea  was  seen  to  be  good  in  general  for  all  computers  and 
was  taken  back  into  the  main  line  of  computers.  Thus,  what  started 
out  to  be  a  functional  separation  turned  out  to  be  simply  a  way 
to  produce  rapid  development  of  a  more  universally  applicable 
computer.  A  classic  example  is  the  expansion  of  input/output 
facilities in creating a functionally specialized business machine,
which simply led to better I/O facilities for all computers. We
will have more to say about such examples as we discuss the values
along  the  dimension. 

Computer-system  function 

Scientific. The first machines were clearly designed for scientific
calculations. In fact, Aberdeen Proving Grounds funded the early
work on the ENIAC for the computation of ballistic firing tables.
And the image used frequently by the early computer designers
was  the  computer  as  a  statistical  clerk,  the  arithmetic  unit  being 
the  desk  calculator,  the  memory  the  work  sheet,  and  the  program 
the  instructions  that  the  mathematician  gave  to  the  clerk. 

From  a  design  standpoint,  scientific  computation  has  posed  two 
striking  requirements.  The  first  is  the  great  accuracy  of  the  num- 
bers, which  has  led  to  word  lengths  of  36  to  60  bits  (11  to  18 
decimal  digits  of  significance)  and  arises  from  the  propagation  of 
roundoff  error  during  repeated  arithmetic  operations.  The  second 
is  the  emphasis  on  fast  arithmetic  operations,  i.e.,  for  arithmetic 
power.  In  the  early  machines  the  standard  rule  for  estimating 
computation  times  was  to  count  the  number  of  multiplications  in 
a  program;  all  else  could  be  neglected.  The  arithmetic  unit  has 
developed to where the floating point multiply is hardly more
expensive than floating point add. This requirement on fast
arithmetic, however, has really been directed at the logical design level,
not at the ISP or PMS level. Thus, the main effect at the ISP is
the  adoption  of  long  word  lengths,  floating  point  data-types  (in 
addition  to  integers),  and  an  extensive  repertoire  of  arithmetic 
operations  in  the  ISP.  The  main  PMS  effect  is  the  emphasis  on 
the  classic  "statistical  clerk"  PMS  design. 
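The correspondence quoted above between word length and decimal significance can be checked with a short calculation. The sketch below assumes only that an n-bit field carries n × log10(2) decimal digits; the function name is ours, chosen for illustration.

```python
import math

def decimal_digits(bits: int) -> float:
    """Decimal digits of significance carried by a field of `bits` binary digits."""
    return bits * math.log10(2)

for bits in (36, 60):
    # Prints the approximate decimal significance for each word length.
    print(f"{bits} bits ~ {decimal_digits(bits):.1f} decimal digits")
```

Rounded to whole digits, the two word lengths give the 11 and 18 digits cited in the text.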

The press for increased arithmetic processing has led in recent
times to the development of various forms of Pc concurrency, as
in the look-ahead of Stretch (Chap. 34) and the n-instruction buffer
of the CDC 6600 (Chap. 39). This might be considered a unique
functional  specialization  for  scientific  computation.  It  is  too  early 
to  tell,  but  it  is  our  impression  that,  although  the  needs  for  sci- 
entific computation  initiated  the  exploration  of  concurrency  and 
parallelism,  we  will  eventually  see  them  in  all  computers  above 
a  certain  power,  whatever  the  task  domain.  Physical  limits  on 
component  speed  and  signal  propagation  will  make  these  tech- 
niques universally  attractive. 

A better case for permanent specialization can be made in the
special algorithm computers, which compute the fast Fourier
transform or do vector operations. Here we finally have systems
whose whole design is responsive to a narrow class of problems.
This may extend to the very special kinds of Pc parallelism
exhibited by the ILLIAC IV (Chap. 27), although there is substantial
generality in such systems.

Business. In the early days of electronic computing it was felt by
many that there was a major functional separation between business
computing and scientific computing.¹ Scientific problems were
"large computing-small input/output"; business problems were
"small computing-large input/output." Certainly most of the
existing computers, designed for scientific computation, had poor
input/output facilities. The IBM 701, for example, used the Pc
to control everything dynamically, actually catching the bits from
running tapes on the fly (by executing well-timed small loops).
These design efforts for business computers resulted in the IBM
702 (and subsequently the IBM 705, 708, and 7080). This machine
had two major innovations for IBM: It used characters, and it had
a PMS structure that permitted more flexible and voluminous
input/output. The latter feature was immediately incorporated
into scientific computers, e.g., into the 709, and then into all large
scientific computers as separate input/output control (either Kio
or Pio), for it was realized that there were also demands on
input/output for scientific calculation. Thus the bifurcation was
temporarily halted.

The  specialization  to  characters  as  a  basic  type  (as  opposed 
to  long  words)  was  already  present  in  the  IBM  702  but  did  not 
have  its  effect  until  5  years  later  with  the  development  of  the  IBM 
1401  (Chap.  18).  The  latter  machine  was  adapted  to  business,  both 
in  being  character-based  and  in  being  small  enough  so  that  small 
businesses  could  afford  it.  It  was  extremely  successful  (many  thou- 
sands were  produced)  and  certainly  represents  a  successful  func- 

¹Such feelings are still extant, but we are concerned here not with the
validity of the feelings but with what they led to at a particular period
of computer development.


tional  specialization  for  business.  However,  it  is  interesting  that 
the specialization has not been maintained, for the IBM
System/360 (Chaps. 43 and 44) is again a single machine, although
it has in essence two internal ISP's, one centered around characters
and the other around floating point data-types, that is, a business
and a scientific specialization residing side by side.¹

Control.  The  third  functional  value  is  a  computer  used  for  control 
in  real  time.  Examples  are  process-control  computers,  aerospace 
computers,  and  laboratory  instrument-control  computers.  The  role 
of  the  computer  is  to  act  as  a  sophisticated  control  (K)  in  some 
larger  physical  process,  and  thus  it  plays  a  subordinate  role.  Their 
relatively  late  arrival  was  due  to  the  high  cost  and  unreliability 
of  early  computers,  as  well  as  to  the  lack  of  necessary  interface 
equipment. 

The  functional  specialization  is  seen  most  strongly  in  the  word 
size,  which  reflects  the  appropriate  numerical  data-type.  The 
numbers used in control processes are generated by physical
devices and are rarely better than 0.1 percent accurate. Since
elaborate arithmetic calculations are not called for, the numbers, and
hence  the  word  size,  can  be  around  12  bits.  Most  control  com- 
puters have  been  12  to  18  bits/word.  A  second  specialization,  again 
reflecting  appropriate  data-types,  is  that  all  control  computers  are 
binary  and  have  boolean  operations.  This  arises  because  many  of 
the  external  conditions  to  be  sensed  and  effected  are  binary  in 
nature. 
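The link between measurement accuracy and word size can be made concrete. The following is a minimal sketch, assuming that "0.1 percent accurate" means a value need only be resolved to 1 part in 1000; the function name is hypothetical.

```python
import math

def bits_for_relative_accuracy(accuracy: float) -> int:
    """Smallest number of bits whose 2**bits levels resolve `accuracy` (a fraction)."""
    return math.ceil(math.log2(1.0 / accuracy))

needed = bits_for_relative_accuracy(0.001)   # 0.1 percent, i.e. 1 part in 1000
print(needed)                                # bits required to hold such a value
print(1.0 / 2**12)                           # resolution offered by a 12-bit word
```

Ten bits already resolve 1 part in 1000, so a 12-bit word leaves a small margin for a sign and for intermediate results, which is consistent with the 12- to 18-bit range given above.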

About the only other functional specialization of control
computers is the interrupt² capability to allow them to respond to
many  potentially  simultaneous  external  conditions  in  real  time. 
This  provides  apparent  parallelism,  though  still  using  a  sequential 
processor.  This  is  another  possible  example  of  functional  speciali- 
zation leading  to  reunification  rather  than  divergence,  for  it  has 
again  been  widely  accepted  that  all  general-purpose  computers 
must  have  good  interrupt  capabilities.  However,  in  actuality, 
interrupts,  though  not  existing  in  early  computers,  were  developed 
to  obtain  good  input/output  facilities,  not  for  control  computers. 

Chapters  7  and  29  give  examples  of  aerospace  computers,  and 
Chap.  33  describes  the  IBM  1800,  which  is  specifically  designed 
for process control. As these examples show, a complex ISP is not

¹The story above has been told exclusively in terms of IBM machines.
Although  this  does  not  distort  the  picture  too  strongly  in  terms  of  total 
movements  of  the  field,  since  IBM  dominated  the  market,  concurrent 
developments  were  taking  place  throughout  the  field.  UNIVAC  I  was  the 
first  computer  built  by  a  manufacturer  and  did  not  have  the  idiosyncrasies 
we  ascribe  to  IBM;  on  the  other  hand,  the  marketing  effort  for  it  was  nil. 
²Apparently introduced in the UNIVAC 1103.


necessarily  required.  This  in  part  reflects  the  fact  that  control 
computers  may  retain  their  programs  over  their  whole  lifetime, 
so  that  programming  and  reprogramming  is  less  important.  (It  is 
not  absent,  however,  and  so  this  is  not  a  very  strong  functional 
adaptation.) 

Communication.  The  functional  specialization  of  communication 
could  be  taken  as  a  subfunction  of  a  control  computer.  The  function 
is  mainly  to  behave  as  a  switch.  In  a  message-switching  application 
the  computer  transfers  messages  from  terminals  (and  links)  into 
primary (and sometimes secondary) memories and then transfers
them  to  other  terminals  (and  links).  In  message  switching,  messages 
are  first  stored  and  then  forwarded.  The  computer  in  a  telephone 
exchange  functions  as  a  very  sophisticated  switch  control.  Here 
the  computer  reads  the  off-the-hook  signal,  detects  the  dialed 
numbers,  rings  the  dialed  parties,  and  finally  sets  the  switches  to 
connect the telephones together. In some instances, when it
answers information inquiries about new telephone numbers or
reroutes calls to other phones, it functions as a memory. Thus a
communications computer is functionally a switch or a control
for  a  switch. 
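The store-and-forward behavior described above can be sketched in a few lines. This is an illustrative toy, not a description of any machine in this book: the class name, the terminal labels, and the use of a simple first-in first-out queue as the message store are all assumptions.

```python
from collections import deque

class MessageSwitch:
    """Toy store-and-forward switch: messages are held in memory, then delivered."""

    def __init__(self):
        self.store = deque()    # stands in for the memory holding queued messages
        self.delivered = {}     # destination terminal -> list of (source, text)

    def receive(self, source, destination, text):
        """Store an incoming message from a terminal or link."""
        self.store.append((source, destination, text))

    def forward_all(self):
        """Forward every stored message to its destination, oldest first."""
        while self.store:
            source, destination, text = self.store.popleft()
            self.delivered.setdefault(destination, []).append((source, text))

switch = MessageSwitch()
switch.receive("T1", "T3", "hello")
switch.receive("T2", "T3", "status?")
switch.forward_all()
print(switch.delivered["T3"])
```

The two phases are kept separate on purpose: `receive` only stores, and `forward_all` only forwards, mirroring the statement that messages are first stored and then forwarded.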

The  main  distinction  between  control  computers  and  commu- 
nications computers  is  that  the  task  environment  of  the  latter, 
since  it  consists  of  digitally  encoded  messages  (even  in  the  case 
of the voice telephone exchange), can be handled directly by the
communications  computer.  That  is,  the  communications  computer 
can  do  the  work  of  transshipment  and  storage  as  well  as  control. 

There  are  no  pure  examples  of  communications  computers  in 
this book. However, the Pio's serve essentially the same function
within  a  single  computer  (Part  4,  Sec.  1),  and  they  can  profitably 
be  examined  from  this  viewpoint. 

File  Control.  We  list  this  as  a  separate  specialization  only  because 
a  number  of  computers  have  been  built  to  do  exactly  this  task. 
The  specialization  is  easily  described:  It  is  a  communication  com- 
puter with  the  messages  being  characters  (since  they  are  built  for 
business),  and  with  the  large  memory  (the  file)  being  considered 
to  be  part  of  the  system.  There  are  no  examples  of  file-control 
computers  in  this  book,  but  the  early  IBM  305  and  UNIVAC  file 
computers serve this function. An IBM 1800 is used as the control
for a 10^12-bit photo-optical memory, for example.

Terminal. Since it is possible to obtain a separate computer system
whose only function is to run a display, we have listed this as a
separate functional specialization. In fact, it is better viewed (and
almost always occurs) as a component of a larger computer system,
i.e., as a special Pio. The DEC 338 is such a P.display and is
described both later in this chapter and in detail in Chap. 25.

Time-sharing.  The  requirement  to  have  a  large  number  of  users 
in  simultaneous  conversational  interaction  with  a  single  large 
machine  has  bred  a  new  specialization,  that  of  the  time-sharing 
computer.  All  the  computers  described  above  can  be  time-shared 
(even  if  they  do  not  have  interrupts  or  inherent  multiprogram- 
ming). However,  the  emphasis  on  this  mode  of  operation  with  the 
particular  timing  and  flexibility  requirements  of  human  users  doing 
general  computing  at  consoles  in  multiple  software  systems  has 
led  to  a  number  of  innovations  in  design.  The  most  important 
is  the  virtual-memory  techniques  for  achieving  multiprogramming 
(described  in  Part  3,  Sec.  6).  There  is  also  substantially  increased 
complexity of PMS structure to handle the integration of large
files, swapping memories, and the huge software systems that seem
to be endemic to time-sharing systems. It is still too early to tell
whether any of the design responses will produce permanent
specialization or will again simply be the first instigation of design
features  that  will  become  universally  used. 

In summary, we see that there is functional specialization and
that it translates mostly into total size of the machine and into
the data-types available. Many of the other design aspects created
in response to functional specialization have instead become the
common  property  of  all  machines. 

Performance 

For  a  device  that  does  a  complex  job,  it  is  meaningless  to  ask  for 
a  single  precise  index  of  performance.  It  is  like  asking  for  the 
average  speed  of  a  given  model  of  car  over  its  lifetime  without 
specifying  who  will  own  it,  where  he  will  drive  it,  and  what  sort 
of terrain he will encounter along the way. Notice that the
difficulty is as much in the complexity of the task environment as in
the  complexity  of  the  internal  workings  of  the  machine.  Specify 
everything  about  the  environment,  and  the  performance  can  often 
be  given  in  a  single  figure.  It  may  be  hard  to  determine,  but  at 
least  it  is  well  defined.  If  you  know  the  terrain  and  road  conditions 
perfectly  and  how  the  car  was  driven,  then  from  the  structure  of 
the  car  it  is  possible  to  figure  out  the  instantaneous  velocity  and 
from  this  to  construct  the  average  speed. 

To  put  this  in  terms  of  computers,  given  a  particular  configura- 
tion for  a  computer  system,  given  a  particular  program,  and  given 
a  particular  set  of  input  data,  it  is  possible  to  determine  all  aspects 
of  the  performance:  how  long  it  took,  how  much  space  was  used, 
whether it was correct, and so on. But we are not interested in
such specifics. We want to know how well the computer system
performs, given some vague notion of the kind of task (programs
and data) that will be used with it. Although we know that we
cannot have adequate measures, we believe that there is something
that can be said about the performance, something that tells us
that a CDC 6600 is many times more powerful in actual performance
than a PDP-8.

An interesting way to look at the problem of specifying performance
is to play a simple game: We will give you a number, say
4. You are to give the best description of computer systems involving
only that many parameters (equivalently, dimensions or attributes).
That is, what is the best description of a computer that
can be stated in four numbers? The game is easier to play if we
speak of the dimensions, rather than the information content of
the description (in bits, say).¹ We have still not defined "best,"
of course. It can be taken to mean the best prediction of the
relative ordering of the computer system; better on the index
means better on the same task.²
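One way to make "best prediction of the relative ordering" concrete is to count how many pairs of systems an index ranks the same way as an actual measurement does. The sketch below is only an illustration of that idea; the function name and all scores are hypothetical and describe no real machines.

```python
from itertools import combinations

def ordering_agreement(index, measured):
    """Fraction of system pairs that the index and the measurement rank the same way.

    Both arguments map system name -> score, where larger means better.
    """
    pairs = list(combinations(sorted(index), 2))
    same = sum(
        1 for a, b in pairs
        if (index[a] - index[b]) * (measured[a] - measured[b]) > 0
    )
    return same / len(pairs)

# Hypothetical scores for three made-up systems (not measurements of real machines).
index_score = {"A": 1.0, "B": 4.0, "C": 9.0}
task_score = {"A": 2.0, "B": 1.5, "C": 7.0}
print(ordering_agreement(index_score, task_score))
```

An index that always agrees with the measurement scores 1.0; here the index misorders one of the three pairs, so it scores two-thirds.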

To start at the beginning, what single number would you give
to characterize a computer's power? Such a question makes most
people uncomfortable, since strong feelings exist for at least two
kinds of numbers, dealing with speed and memory, respectively.
If forced, we would probably settle for something related to
processing speed. The cycle time of the primary memory is a possibility
because for simple machines it determines (limits) the operation
rate. It is a structural parameter, but that is no reason to avoid
it as a performance index. The average number of instructions per
second, or operations per second, is a better indicator. Since the
latter does not take into account the size of the word being
processed, perhaps average bits processed per second is the best single
number. (We measure this number at the processor, and it may
include both the instruction and data streams.)
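The difference between the two indicators is easy to see numerically. The sketch below assumes the simplest reading, bits per second as operation rate times word width, and uses hypothetical rates that are not taken from any machine in the text.

```python
def bits_per_second(ops_per_second: float, word_bits: int) -> float:
    """Average bits processed per second, taken as operation rate times word width."""
    return ops_per_second * word_bits

# Hypothetical machines: the rates are illustrative, not figures from the text.
small = bits_per_second(ops_per_second=300_000, word_bits=12)
large = bits_per_second(ops_per_second=300_000, word_bits=60)
print(small, large, large / small)
```

Two machines with the same operation rate look identical on operations per second, yet differ by the ratio of their word sizes on bits per second.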

To take an average we must adopt some weightings. The simplest
scheme is simply to add all the instruction (or operation)
times and divide by their number. This is equivalent to weighting
them equally, the rare ones and the common ones. If we want
to do better than that we need some data. Several sets of relative
frequencies of instruction types, called "mixes," have been used
in the literature. Table 2 gives four examples. The Gibson mix is
probably the best known. The best source for such data comes
from instruction counts of running programs.

¹It is not fair, of course, to invent tricks to encode many conceptually
independent dimensions into a single one, just to beat the limit. On the
other hand, composite dimensions, such as average operation time, are
perfectly acceptable.

²Definitional precision is not appropriate, since we are not attempting to
deal seriously with the technical questions of indices, only to illustrate the
issues.

Table 2  Instruction-mix weights for evaluating computer power

                      Arbuckle               Knight,      Knight,
                      [1966]     Gibson¹     scientific   commercial

Fixed +/-                         6          10(25)²      25(45)²
×                                 3           6            1
÷                                 1           2
Floating +/-           9.5                   10
Floating ×             5.6
Floating ÷             2.0
Load/store            28.5       25 (move)
Indexing              22.5
Conditional branch    13.2       20
Compare                          24
Branch on character              10
Edit                              4
I/O initiate                      7
Other                 18.7                   72           74

   ¹Published reference unknown.
   ²Extra weight for either indirect addressing or index registers.
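The effect of a mix on the average can be sketched as follows. The weights are percentages of the kind listed in Table 2 (10, 10, 6, 2, and 72 for fixed add, floating add, multiply, divide, and other); the instruction times are hypothetical values for an imaginary machine, and the function name is ours.

```python
def average_time(times_us, weights=None):
    """Average instruction time in microseconds.

    With no weights every instruction type counts equally; with weights
    (relative frequencies, e.g. percentages) the common types dominate.
    """
    if weights is None:
        weights = {name: 1.0 for name in times_us}
    total_weight = sum(weights[name] for name in times_us)
    return sum(times_us[name] * weights[name] for name in times_us) / total_weight

# Hypothetical instruction times (microseconds) for an imaginary machine.
times_us = {"fixed_add": 2.0, "float_add": 6.0, "multiply": 10.0,
            "divide": 20.0, "other": 2.0}
# Percentage weights of the kind listed in Table 2.
mix = {"fixed_add": 10, "float_add": 10, "multiply": 6, "divide": 2, "other": 72}

print(average_time(times_us))        # equal weighting
print(average_time(times_us, mix))   # mix weighting
```

With equal weighting the slow but rare divide pulls the average up to 8 microseconds; with the mix the figure drops to 3.24 microseconds, because nearly three-quarters of the weight falls on the fast "other" class.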

Knight  takes  the  view  (Fig.  3)  that  a  single  number  can  be  used 
to  indicate  power,  and  his  formula  has  been  evaluated  for  some 
300  computers  [Knight,  1966].  His  formula  is  the  product  of 
three  factors:  processing  time,  memory  size  (in  words),  and  word 
length.  The  formula  was  derived  (roughly)  to  measure  power  so 
that  technological  change  could  be  modeled.  Applying  the  formula 
is  like  measuring  automotive-vehicle  power  as  a  product  of  speed, 
weight,  and  the  number  of  wheels.  (Such  an  indicator  is  roughly 
proportional  to  a  car's  momentum.)  Thus,  although  it  is  a  reason- 
able single-number  indication  for  power,  a  computer  buyer  could 
not  use  it  directly. 
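The overall shape of the formula in Fig. 3 can be sketched in code. This is a hedged reading of the figure as reproduced here, not Knight's own program: the function and parameter names are ours, times are assumed to be in microseconds, the input/output time is simply passed in rather than computed, and the reference memory size is supplied by the caller using the value as it appears in the figure.

```python
def cpu_time_us(weights_pct, op_times_us):
    """t_c: microseconds for one million operations, 10**4 * sum(C_k * time_k).

    weights_pct are percentages of each operation type and op_times_us their
    times in microseconds; 10**6 operations * (C_k / 100) gives the 10**4 factor.
    """
    return 1e4 * sum(weights_pct[k] * op_times_us[k] for k in weights_pct)

def knight_power(word_bits, memory_words, word_factor, exponent,
                 t_c_us, t_io_us, ref_words, ref_bits=36):
    """Computing power P as read from Fig. 3: a memory factor times operations per second."""
    memory_factor = ((word_bits - 7) * memory_words * word_factor) ** exponent
    reference = (ref_words * (ref_bits - 7)) ** exponent
    return 1e12 * (memory_factor / reference) / (t_c_us + t_io_us)

# Hypothetical operation times (microseconds); weights are the scientific C values.
weights = {"fixed_add": 10, "float_add": 10, "multiply": 6, "divide": 2, "logic": 72}
times = {"fixed_add": 2.0, "float_add": 6.0, "multiply": 10.0, "divide": 20.0, "logic": 2.0}

t_c = cpu_time_us(weights, times)
p_ref = knight_power(36, 33_000, 1, 0.5, t_c, 0.0, ref_words=33_000)
p_big = knight_power(36, 66_000, 1, 0.5, t_c, 0.0, ref_words=33_000)
print(t_c, p_ref, p_big / p_ref)
```

For a machine whose memory matches the reference, the memory factor is 1 and P reduces to operations per second. Doubling the memory with the scientific exponent of .5 raises P by the square root of 2, which shows how weakly memory size enters compared with speed.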

Taking averages, as in the case of mixes, suggests a more
sophisticated approach. A collection of programs, called a "bench mark,"
is  developed  that  does  a  variety  of  different  tasks.  Then  the  one 
number  is  the  time  it  takes  to  do  this  collection.  Such  a  bench 
mark  generates  its  own  frequencies  of  occurrence  of  the  primitive 
instructions.  It  brings  in  a  number  of  additional  dimensions  that 
affect  performance:  the  instruction  code,  the  size  of  Mp,  pro- 
gramming skill,  input/output  devices,  etc.  It  also  carries  with  it 
an  implicit  frequency  of  different  kinds  of  task  demands  (how 
much of the set involves compiling, how much number crunching,
how  much  I/O,  etc.). 
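A bench mark in this sense is simply a fixed collection of programs whose total running time is the one number. The sketch below illustrates the idea with three hypothetical stand-in tasks; the task bodies and names are invented, and the timings it prints depend entirely on the system that runs it.

```python
import time

def run_bench_mark(programs):
    """Run each program once; return per-program times and the single total."""
    times = {}
    for name, program in programs.items():
        start = time.perf_counter()
        program()
        times[name] = time.perf_counter() - start
    return times, sum(times.values())

# Hypothetical stand-ins for compiling, number crunching, and I/O-like work.
def parse_like():
    return [token.upper() for token in ("load x add y store z " * 2000).split()]

def number_crunching():
    return sum(i * i for i in range(50_000))

def io_like():
    return "".join(str(i) for i in range(20_000))

programs = {"compile": parse_like, "crunch": number_crunching, "io": io_like}
per_program, total = run_bench_mark(programs)
print(per_program, total)
```

The choice of which programs go into the collection, and how large each is, fixes the implicit frequency of task demands mentioned above; changing the collection changes the number.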

There  are  severe  practical  problems  in  carrying  out  such  meas- 
urements on  many  computers,  since  the  problems  must  be  coded 
and run on all the systems. It is somewhat easier if the task set
is restricted to programs coded in a procedure-oriented language,
such as FORTRAN, where all computers accept FORTRAN.
Nevertheless,  although  it  has  often  been  done  to  compare  two 
systems,  only  occasionally  has  it  been  done  for  even  a  modest 
number.  We  feel  that  for  a  general-purpose  computer  the  com- 
piler-derived bench  mark  is  a  reasonable  single-performance 
number.  Much  actual  use  will  be  with  the  compiler,  and  good 
compilers produce code to rival hand coding, so that special
features of the machine are utilized. Cox [1968] compares several,
using  hand  coding  and  compilers  for  several  tasks. 

There  is  a  difficulty  with  the  bench-mark  scheme  that  is  inher- 
ent in  its  strongest  advantage,  that  of  doing  a  total  problem  and 
thus  integrating  all  features  of  the  computer.  The  number  obtained 
depends  not  only  on  the  type  of  computer,  for  example,  an  IBM 
704,  but  on  the  exact  configuration,  for  example,  16  kwords  of  Mp 
versus  32  kwords,  and  even  on  the  operating  system  and  the  soft- 
ware (which  version  of  FORTRAN).  Thus,  although  the  number 
perhaps  comes  closest  to  an  adequate  single-performance  figure, 
it  becomes  much  less  of  a  parameter  characterizing  the  structure 
of  the  computer  than  one  characterizing  a  contingent  total  system. 

Let us underscore again the distinction between the computer
type and the particular configuration (possibly including basic
software) assembled in a particular installation. Computer systems
are designed with certain forms of variability. To specify a CDC
1604 is to specify many things, such as the ISP of the Pc, the cycle
time of Mp, the K's used to control secondary memories (Ms), and
interfaces to the external world. But it leaves open many other
things, e.g., the types and sizes of Ms and the size of Mp. On
some computers it can even leave open part of the ISP (e.g.,
the multiply/divide options on many small machines), or the speed
of the Pc and Mp (e.g., in the IBM System/360).

$$P = \frac{10^{12}\,\dfrac{[(L-7)(T)(WF)]^{i}}{[33{,}000\,(36-7)]^{i}}}{t_c + t_{I/O}}$$

$$t_c = 10^{4}\,[C_1 A_{FI} + C_2 A_{FL} + C_3 M + C_4 D + C_5 L]$$

$$t_{I/O} = P \times OL_1\,[10^{6}(W_{I1} \times B \times 1/K_{I1}) + (W_{O1} \times B \times 1/K_{O1}) + N(S_1 + H_1)]\,R_1
        + (1-P)\,OL_2\,[10^{6}(W_{I2} \times B \times 1/K_{I2}) + (W_{O2} \times B \times 1/K_{O2}) + N(S_2 + H_2)]$$

Variables (attributes of each computing system)

P      = the computing power of the nth computing system
L      = the word length (in bits)
T      = the total number of words in memory
t_c    = the time for the Central Processing Unit to perform 1 million operations
t_I/O  = the time the Central Processing Unit stands idle waiting for I/O to take place
A_FI   = the time for the Central Processing Unit to perform 1 fixed point addition
A_FL   = the time for the Central Processing Unit to perform 1 floating point addition
M      = the time for the Central Processing Unit to perform 1 multiply
D      = the time for the Central Processing Unit to perform 1 divide
L      = the time for the Central Processing Unit to perform 1 logic operation
B      = the number of characters of I/O in each word
K_I1   = the input transfer rate (characters per second) of the primary I/O system
K_O1   = the output transfer rate (characters per second) of the primary I/O system
K_I2   = the input transfer rate (characters per second) of the secondary I/O system
K_O2   = the output transfer rate (characters per second) of the secondary I/O system
S_1    = the start time of the primary I/O system not overlapped with compute
H_1    = the stop time of the primary I/O system not overlapped with compute
S_2    = the start time of the secondary I/O system not overlapped with compute
H_2    = the stop time of the secondary I/O system not overlapped with compute
R_1    = 1 + the fraction of the useful primary I/O time that is required for
         nonoverlap rewind time

Semiconstant factors
                                                               Values
                                                       Scientific    Commercial
Symbol  Description                                    computation   computation

WF      the word factor
          a. fixed word length memory                    1             1
          b. variable word length memory                 2             2
C_1     weighting factor representing the percentage
        of the fixed add operations
          a. computers without index registers or
             indirect addressing                        10            25
          b. computers with index registers or
             indirect addressing                        25            45
C_2     weighting factor that indicates the
        percentage of floating additions                10             0
C_3     weighting factor that indicates the
        percentage of multiply operations                6             1
C_4     weighting factor that indicates the
        percentage of divide operations                  2             0
C_5     weighting factor that indicates the
        percentage of logic operations                  72            74
P       percentage of the I/O that uses the
        primary I/O system
          a. systems with only a primary I/O system      1.0           1.0
          b. systems with a primary and secondary
             I/O system                                 variable      variable
W_I1    number of input words per million internal
        operations using the primary I/O system
          a. magnetic tape I/O system                   20,000        100,000
          b. other I/O systems                           2,000         10,000
W_O1    number of output words per million internal     the values are the same as
        operations using the primary I/O system         those given above for W_I1
W_I2,   number of input/output words per million        the values are the same as
W_O2    internal operations using the secondary         those given above for W_I1
        I/O system
N       number of times separate data is read into
        or out of the computer per million
        operations                                      (not legible)  20
OL_1    overlap factor 1: the fraction of the primary
        I/O system's time not overlapped with compute
          a. no overlap, no buffer                       1             1
          b. read or write with compute,
             single buffer                               .85           .85
          c. read, write and compute, single buffer      .7            .7
          d. multiple read, write and compute,
             several buffers                             .60           .60
          e. multiple read, write and compute with
             program interrupt, several buffers          .25           .55
OL_2    overlap factor 2: the fraction of the           values are the same as those
        secondary I/O system's time not overlapped      given above for OL_1, a
        with compute                                    through e
i       the exponential memory weighting factor          .5            .333

Fig. 3. Knight's functional model algorithm to calculate P for any
computer system. (Courtesy of Datamation, vol. 12, no. 9, September, 1966,
page 42.)

When  we  ask  questions  about  computer  systems,  we  should  be 
clear  whether  we  are  talking  about  a  computer  "type,"  such  as 
CDC  1604,  or  whether  we  are  talking  about  a  particular  installa- 
tion, with  all  the  variability  specified.  It  is  possible  to  describe 
either  with  PMS  and  ISP,  provided  we  recognize  that  the  diagrams 
for  the  types  represent  maximal  possibilities  for  assembling  par- 
ticular systems.  This  is  how  almost  all  the  PMS  and  ISP  diagrams 
in  this  book  were  prepared.  From  the  point  of  view  of  our  "number 
game,"  if  we  are  talking  about  computer  types,  we  might  prefer 
numbers  that  do  not  depend  on  the  particular  configuration. 

If  two  numbers  were  available  for  describing  performance, 
what would they be? Clearly there are several directions to go.
One  could  fractionate  the  bench  mark,  so  that  one  has  a  bench 
mark  for  arithmetic-rich  tasks  and  a  bench  mark  for  others  (a 
composite  of  compiling  and  data  processing).  One  could  decom- 
pose the  processing  rate  into,  say,  operations  per  second  and  word 
size  (from  which  bits  per  second  can  be  recaptured  approximately). 
Alternatively,  one  could  retain  only  a  single  number  for  processing 
rate  and  add  a  measure  of  the  memory  available,  e.g.,  size  of  Mp 
(in  bits).  Of  the  three  we  would  choose  the  latter,  especially  if 
we  were  talking  about  a  particular  installation  rather  than  com- 
puter types,  for  which  Mp  size  remains  variable. 

We  can  continue  this  game  through  several  numbers.  Table  3 
shows  some  of  our  choices.  Various  parameters  drop  out  or  change 
only  when  they  are  decomposed  into  other  parameters  from  which 
they can be recovered. Thus, initially Mp must be measured in
bits,  but  when  the  word  size  is  given,  Mp  is  more  reasonably 
measured  in  words.  One  of  the  reasons  for  exposing  such  a  list 
is  to  emphasize  its  judgmental  and  approximate  character.  There 
is  as  yet  no  way  to  validate  such  proposals  for  brief  descriptions. 


If  we  had  bench  marks,  which  are  themselves  only  approximations 
at  measuring  performance,  we  might  look  at  how  well  the  param- 
eters in  Table  3  predict  the  bench  marks.  But  there  remain  the 
difficulties  of  how  to  take  into  account  the  additional  aspects  of 
the  total  system  (e.g.,  compiler  efficiency)  that  are  implied  in  the 
bench mark. Alternatively, one might want to construct a mixed
description  of  bench-mark  numbers  and  measurements  of  the  kind 
in  Table  3.  Then  the  relationship  between  bench  marks  and  these 
other  measurements  would  become  an  indirect  measure  of  the 
efficiency  of  the  rest  of  the  system. 

We  have  discussed  performance  in  a  crude  and  cavalier  way, 
but  this  accurately  reflects  the  state  of  the  art.  There  are  no  precise 
measures for performance. There are precise structure and
performance measures of individual components (e.g., memory size,
speed, and word length, and processor instruction times). When
designers  (and  users)  are  faced  with  obtaining  a  certain  total 
performance  for  a  given  cost,  the  only  method  is  that  of  the  bench 
mark,  because  the  task  is  such  a  significant  variable.  If  performance 
is  to  be  increased,  unless  the  task  is  sufficiently  trivial,  it  is  difficult 
to  predict  what  effect  changing  even  the  most  direct  structural 
variables  will  have  (e.g.,  memory  speed). 

Structure 

We  now  turn  from  function  and  performance,  which  provide 
design  constraints  and  objectives,  to  the  dimensions  of  structure, 
which  provide  the  space  in  which  the  design  is  actually  cast.  A 
structural  dimension  is  one  in  which  the  designer  can  attain  any 
of  the  values  along  the  dimension  by  relatively  direct  means.  Thus 
a  machine  is  completely  specified  by  listing  all  its  values  along 
the structural dimensions. From this, the system's function and
its performance within that function can be determined.

What  dimensions  should  be  selected  for  structure?  The  view- 
point is  distinctly  different  from  that  of  performance,  where  one 


Table 3    Performance parameters specification (as a function of an allowable number of parameters)

[Table body not recoverable from the scan. Legible parameter entries: Pc(i.rate:(b/s)), Mp(size:(b)), Pc(operation-rate:(op/s)), Pc(i.width:(b)); two further entries, for Mp and Ms, are partly illegible.]




averages  and  combines  many  features  to  summarize  effective  out- 
put. This  tends  to  obscure  structure.  For  structure,  one  wants 
maximally independent aspects which are easily obtained if selected
as a design choice. For example, if the computer designer
had  only  a  single  dimension  to  describe  a  computer,  he  would 
undoubtedly  select  the  logic  technology  used  in  the  Pc  and  K's. 
This  tells  him  a  good  deal  about  many  aspects  of  the  computer's 
structure.  In  fact,  the  technology  and  the  average  bits  processed 
per  second  by  the  Pc  are  correlated,  and  so  each  can  be  used  to 
predict  the  other,  though  only  imperfectly.  If  one  is  interested 
in  performance,  effective  bits  per  second  is  preferred;  if  one  is 
interested  in  design,  technology  is  preferred. 

The computer space in Table 1 presents our choice of the major
structure  dimensions.  There  is  even  less  means  to  validate  the 
choice  of  dimensions  here  than  there  is  for  performance.  Never- 
theless, there  are  a  few  hallmarks.  Perhaps  the  most  important 
is  redundancy  (the  opposite  side  of  the  coin  from  independence, 
mentioned above). Several dimensions of structure may covary,
so that giving any one of them is tantamount to giving the others.
This covariation need not come from physical dependence; it may
arise  from  the  nature  of  an  appropriate  design  and  good  engineer- 
ing practice.  Such  a  cluster  of  covarying  dimensions  is  likely  to 
indicate  an  important  dimension  (which  one  among  the  correlates 
is  to  be  used  is  a  secondary  matter).  Table  1  is  organized  in  terms 
of  such  clusters,  with  one  of  each  selected  as  the  main  representa- 
tive and  placed  at  the  left. 

A second hallmark derives from the hierarchical nature of
computer  systems.  Generally  a  description  of  a  system  consists  of 
the  union  of  the  description  of  its  parts,  plus  a  description  of  the 
interconnections.  This  is  the  basic  style  of  PMS,  for  example.  But 
there  are  a  few  features  that  affect  the  total  system,  i.e.,  affect 
many components. These are usually rather important. Technology
is  a  prime  example. 

Yet  a  third  clue  is  that  the  dimensions  discriminate  the  actual 
population of computers. If all machines had single-address instructions,
for instance, there would be no sense in using number
of  addresses  per  instruction  as  a  dimension.  Any  computer  engineer 
who  had  studied  machines  at  all  would  know  this  to  be  true  of 
all  computers.  Thus  one  looks  for  dimensions  that  spread  the 
machines  out  evenly  into  a  substantial  number  of  categories. 

If  the  dimensions  of  the  space  are  known,  a  computer  is  sup- 
posed to  be  defined  by  a  single  point.  For  most  existing  computers 
this  is  actually  the  case.  However,  if  a  computer  system  were 
complicated  enough,  say  consisting  of  several  processors,  each  built 
with  different  technologies  and  having  a  different  number  of  ad- 
dresses per  instruction,  then  such  a  representation  would  not  be 


possible. For instance, the Rice University computer uses vacuum
tubes, transistors, and integrated-circuit logic. But such complexities
are rare: time and good engineering practice work against
them. If it were necessary to consider such cases, then additional
dimensions  (e.g.,  for  secondary  and  tertiary  logic)  could  be  added, 
or  several  points  in  the  space  for  a  given  computer  could  be 
used. 

The  computer-structure  space  is  thus  our  choice  of  the  seven 
most  important  dimensions.  It  is  our  response,  so  to  speak,  to 
playing the number game, given only seven descriptors. They are
arranged in order of importance, although clearly no simple way
exists  to  validate  such  an  order.  But,  if  we  were  to  have  only  three 
attributes  to  describe  the  structure  of  a  computer  system,  we 
would  pick  logic  technology,  word  size,  and  PMS  structure  (i.e., 
what  processors  exist  with  what  functions). 

At this point we are ready to proceed through the space, describing
the various dimensions and discussing how the computer
systems in this book illustrate various points along them. We take
up  each  major  dimension  separately.  A  few  of  the  correlated 
dimensions  are  accorded  separate  sections,  but  most  are  discussed 
along  with  the  main  dimension. 

Technology 

Computers are constrained by the physical technology from which
they are constructed. It is not just that new technologies provide
greater speed, size, and reliability at less cost, although of course
they  do  that.  But  technologies  dictate  the  kinds  of  structures  that 
can  be  considered  and  thus  come  to  shape  our  whole  view  of  what 
a  computer  is.  For  instance,  the  emergence  of  the  PMS  system 
level is due to advances in technology. Prior to transistor technology,
it did not make sense to think of elaborate PMS structures.
The  costs  of  the  various  parts  were  too  high  and  the  reliabilities 
were too low. When, occasionally, such a machine was in fact
designed, it invariably proved too far ahead of its time to succeed.
An example in this book might be the RW-40, described in 1960
(Chap. 38). A more classic example is the Analytic Engine of
Babbage, which he designed in 1844 and was never able to complete.¹
The technology of the time was entirely mechanical, and
its crude state accounts for a large share of the failure. Thus the
technology is by all odds the most important single attribute to
know  about  the  computer  system. 

Many technologies go into making up a computer. Each type
of  component  typically  uses  a  different  one.  In  current  (so-called 

¹Thus, the first real digital computer established the precedent of failing
by a large margin to meet the expected dates of completion and full
operation. 




third-generation) machines the Pc may use hybrid- and integrated-circuit
technology for its logic, thin-film technology for the
Pc  generalized  registers,  core  technology  for  the  Mp,  electro- 
mechanical technology  for  tapes  and  disks  (with  integrated  circuits 
for  logic),  mechanical  technology  for  card  punches  and  type- 
writers, and  even  manual  technology  for  mounting  tapes  and  disk 
packs.  The  existence  of  all  these  technologies  poses  major  issues 
of  systems  balance,  issues  which  are  only  imperfectly  resolved.  For 
example,  it  remains  true  in  the  current  generation  that  input/ 
output  is  not  in  balance  with  the  internal  structures.  This  is  due 
to  the  crude  state  of  terminal  technology,  so  that  it  appears  to 
cost too much to provide an appropriate solution.¹

The  heterogeneity  of  technologies  is  not  a  consequence  of 
cost/benefit  analysis;  rather,  each  represents  the  forefront  tech- 
nology for  the  type  of  device  shown.  (There  is,  of  course,  cost/ 
performance  exchange  for  any  component,  but  this  is  usually 
within  a  technology.)  Thus  there  is  a  sense  in  which  the  leading 
technology  can  be  used  to  represent  them  all.  This  is  the  technol- 
ogy used  for  the  logic  level  and  is  the  one  listed  in  the  computer 
space.  If  it  is  known  that  transistor  logic  is  used  in  the  Pc  of 
a  computer,  it  is  a  safe  prediction  that  Ms  is  electromechanical, 
Mp  is  core,  Tio  is  electromechanical  printers  and  punches,  etc. 
This  reflects  the  fact  that  technology  develops  and  hence  be- 
comes locked  with  calendar  time.  Thus  a  prediction  is  from 
logic  technology  to  date  and  then  to  all  other  things  known  to 
be  current  at  that  date. 

This  correlation  of  date  with  technology  is  given  in  the  com- 
puter space  along  with  the  generation.  It  can  also  be  seen  in  the 
time  chart.  The  correspondences  must  be  taken  as  very  rough  only. 
The  technologies  are  listed  in  increasing  power  (and  decreasing 
cost). The dates run in exactly the same order. The one exception
is fluidics, which has been introduced very recently and is a special
technology for ruggedness, reliability, and direct external coupling
in  certain  control  systems.  (Small  fluidic  computers  are  at  the  early 
prototype  stage.) 

Alongside  the  technology  dimension  we  list  the  dimensions: 
Pc  speed  (operations  per  second),  and  cost  (dollars  per  million  op- 
erations), all  of  which  vary  directly  (or  inversely)  with  logic  tech- 
nology. In  general,  costs  are  extremely  difficult  to  determine,  espe- 

¹Although beside the point of the current discussion, one reason why these
imbalances appear to be "permanent" is that the time constant for change
in the technology is of the same order as the time constant for human beings
(i.e.,  systems  analysts,  programmers,  and  users)  to  understand  the  imbal- 
ance. Before  system  imbalance  is  diagnosed  and  solved,  the  terms  of  the 
problem  change,  inducing  new  imbalances. 



cially  when  technological  costs  are  of  interest  rather  than  market 
costs  (which  reflect  numerous  other  factors).  Nevertheless  the 
effect  of  technology  on  costs  has  been  so  striking  (while  simulta- 
neously pushing  up  performance  along  all  other  dimensions)  that 
it  seemed  necessary  to  give  a  measure  of  cost  in  Table  1,  no  matter 
how  crude. 

We  have  indicated  only  a  few  of  the  dimensions  that  are  corre- 
lated with  technology.  In  fact,  the  only  dimensions  in  Table  1  that 
are  independent  of  technology  are  the  word  length  and  the  Pc 
addresses/instruction.  All  the  rest  show  dependence  on  technol- 
ogy. For  some,  such  as  memory  speed  and  size,  there  is  a  direct 
correlation.  For  others,  such  as  PMS  structure  and  Pc  concurrency, 
the  development  of  more  complex  versions — the  leading  edge,  so 
to  speak — depends  on  technology,  but  there  is  free  use  of  all 
versions  that  are  in  existence  at  any  given  time.  There  are  still 
other  dimensions  of  importance,  not  shown  in  Table  1,  that  have 
also  changed  with  technology,  e.g.,  electric-power  consumption. 

One  way  to  see  both  what  varies  and  what  is  independent  of 
technology is to compare selected machines. For instance, Whirlwind
(Chap. 6), a first-generation system, and the IBM 1800 (Chap.
33),  a  third-generation  system,  have  reasonably  similar  ISP  descrip- 
tions, if  one  ignores  index  registers,  which  were  not  invented  at  the 
time  of  Whirlwind's  design.  However,  they  have  very  different 
PMS structures. In Whirlwind, the early system, the transfer of information
between Tio's and Ms was under program control of the
Pc. The existing Pc registers and transfer gates were used because
it  was  too  expensive  to  have  separate  ones.  In  the  1800,  which 
uses  hybrid  circuits,  it  is  economical  to  have  additional  subsystems 
devoted  to  special  functions;  hence  there  are  many  Pio's  operating 
independently  of  the  main  Pc.  It  was  not  cost  alone  that  limited 
the  complexity  of  first-generation  vacuum-tube  systems.  The  large 
physical  size  of  tubes  introduced  substantial  transmission  delays; 
their  large  power  consumption  added  dependency  on  a  cooling 
system;  and  their  limited  life  and  deteriorating  nature  constrained 
the  number  of  tubes  that  could  be  used  in  a  system  requiring  high 
reliability. 

The  IBM  700  scientific  series  (701,  704,  709,  7090,  7040,  7044, 
7094  I  and  II)  offers  another  comparison,  where  there  is  an  evolv- 
ing structure  over  time,  hence  across  technologies,  but  where  for 
reasons  of  compatibility  the  ISP's  have  remained  almost  constant 
(except  for  the  701).  Again  we  see  radical  increases  both  in  perform- 
ance (Pc  speed  increases  by  a  factor  of  5  from  the  701  to  the  704 
and  another  10  to  the  7094  II)  and  PMS  complexity.  But  various 
other  features,  though  not  affecting  compatibility,  were  locked  in 
with  the  ISP  and  remained  fairly  constant.  For  example,  Mp  size 
went  to  32  kw  (kilowords)  early  in  the  series  with  the  704;  and 




it took a jerry-rigged modification to get 64 kw on a 7094 toward
the end of the lifetime of the series (see Chap. 41, page 517).

Throughout  this  section  we  have  referred  to  technology  as  the 
dominant  factor  in  the  computer.  Does  this  mean  that  computer 
development  waits  upon  new  fundamental  windfalls?  We  have 
been  lucky  in  getting  the  transistor  and,  to  a  lesser  degree,  the 
integrated  circuit  from  external  efforts.  However,  core  memories 
were  invented  for  the  computer  and  resulted  because  of  need. 
Read-only  memories  have  also  resulted  both  from  development 
at the circuit level and from pressure above, requiring the memories
to be developed. All the electromechanical secondary memories
(i.e., magnetic tape, drums, disks, and photostores) have
resulted  from  the  computer's  needs.  Thus,  although  technology 
is  dominant,  the  computer  often  forces  the  development. 

The  Pc  operation  rate  is  strongly  correlated  with  logic  tech- 
nology, as  we  have  indicated  in  the  computer  space.  Our  discussion 
about  technology  and  generations  is  also  about  operation  rate.  The 
principal reason for the higher operation rate is faster
logic technology. Technology also has a secondary effect on increasing
speed. More reliable devices allow large computers to
be built. Smaller devices allow higher device densities, thus decreasing
stray capacitance and inductance and shortening transmission
delays. Smaller components also allow increased interconnection
density.

Operation rate is also relatively highly correlated with total
performance. If we hold the structure and concurrency constant,
the simplest way to increase performance is by increasing the clock
rate. The increase in the performance/cost ratio over the past two
decades of computer evolution has made its primary gains
through higher operation rates. The two 16-bit computers already
mentioned, Whirlwind (Chap. 6) and the IBM 1800 (Chap. 33),
provide a nice comparison of the evolution. With a difference of
10 years and two generations, their cost ratio is ~10:1 whereas
performance is ~1:5 and the internal clock rates are also ~1:5.¹
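
The two ratios quoted for Whirlwind and the IBM 1800 can be combined into a single performance/cost gain. The following is only a rough sketch that takes the approximate figures in the text at face value; the function name is ours and is not part of any notation used in the book.

```python
from fractions import Fraction


def performance_per_cost_gain(cost_ratio_old_to_new, perf_ratio_old_to_new):
    """Factor by which performance/cost improves from the old machine to the new.

    Both arguments are old:new ratios, e.g. a cost ratio of 10:1 is
    Fraction(10, 1) and a performance ratio of 1:5 is Fraction(1, 5).
    """
    # (perf_new / cost_new) / (perf_old / cost_old)
    #   = (cost_old / cost_new) / (perf_old / perf_new)
    return cost_ratio_old_to_new / perf_ratio_old_to_new


# Whirlwind -> IBM 1800, using the approximate ratios quoted in the text.
gain = performance_per_cost_gain(Fraction(10, 1), Fraction(1, 5))
print(gain)  # 50
```

On these figures the later machine delivers roughly fifty times as much performance per unit cost, which is the product of the two individual ratios.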

Information  structure:  word  length,  information  base, 
and  data-types 

All computers structure their information in a hierarchy of units,
which we defined as an i-unit in Chap. 2. For example, the IBM
System/360 starts with the bit; then the byte, which is 8 bits; then
the word, which is 4 bytes; then the record, which is a variable
number of words. In between, playing minor roles, are decimal

¹However, it is not as dramatic an example as we could find. By picking
a better third-generation example we might get a cost ratio of ~100:1 and
a performance ratio of ~1:10.


digits  (4  bits),  the  halfword,  and  the  double  word.  A  number  of 
features  of  the  design  are  related  to  this  hierarchical  organization 
of  data.  Before  we  consider  them,  we  need  to  characterize  the 
organization  itself.  One  characteristic  of  this  organization,  the 
word  length  (in  bits),  gives  most  of  the  information,  the  rest  of 
the  hierarchy  adding  only  a  little. 

Let  us  see  why  this  is  so.  At  the  bottom  there  is  the  bit,  encoded 
in two-state devices. Although other numbers of states are possible,
and ternary (three-state) machines have been proposed occasionally,
digital technology has developed exclusively to handle binary
information. There are several reasons for this. The first is the
requirement for high reliability and high signal-to-noise ratios in
the basic devices. Generally a basic n-state device (that is, one
not built up from other n-state devices) is realized by breaking
a continuous physical dimension, such as voltage, current, or
magnetic flux, into n discrete levels or regions. Reliability and
signal-to-noise ratio then depend on keeping adequate separation.
This is easiest to do with two states (e.g., in the limit they become
on-off devices) and becomes progressively more difficult as n increases.
The second reason is the simplicity of the logical design
for binary representations. A basic device for combining two
ternary digits must deal with 3 x 3 = 9 configurations, rather than
2 x 2 = 4 configurations for the binary case. This also gets worse
as n increases.
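
The growth in configurations is easy to tabulate. The sketch below simply counts the input combinations a device must handle when it combines two digits of n states each; the function and its `n_inputs` parameter are our own illustrative names.

```python
def input_configurations(n_states, n_inputs=2):
    """Number of distinct input combinations for a device that combines
    `n_inputs` digits, each of which can take `n_states` values."""
    return n_states ** n_inputs


for n in (2, 3, 4, 10):
    print(n, input_configurations(n))
# 2 4
# 3 9
# 4 16
# 10 100
```

The count grows as the square of the number of states for a two-input device, which is the sense in which the logical design "gets worse" as n increases.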

A final reason (the coup de grace, so to speak) is that no one
has  ever  found  striking  advantages  for  the  resulting  processing 
structure  in  having  more  than  two  states.  Thus  there  are  no  com- 
pelling reasons  to  suffer  the  first  two  disadvantages.  In  short,  what 
might  have  been  an  important  dimension  on  which  to  distinguish 
computers,  namely,  the  number  of  states  in  the  basic  encoding, 
turns  out  instead  to  be  one  of  the  great  uniformities  in  digital 
technology.

Information  base.  That  the  physical  devices  deal  ultimately  in  bits 
does  not  imply  that  the  information  processing  must  be  organized 
in terms of bits. It is possible to select an arbitrary base (one with
any  number  of  states)  and  construct  the  entire  ISP  in  its  terms. 

A base unit is represented physically, of course, as a set of bits.
If  one  wanted  a  base  13  machine,  for  example,  one  would  have 
to  use  at  least  4  bits  (with  16  states)  to  encode  it.  But  no  operations 
at  the  ISP  level  would  refer  to  anything  but  base  units  and  data 
structures  built  up  from  sets  of  base  units,  and  there  would  be 
no way to manipulate directly the bits that represented the base.
Thus,  using  a  base  other  than  binary  obtains  whatever  advantages 
might  accrue  to  n-state  units,  without  any  of  the  disadvantages 
at  the  device  level. 
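
The base 13 example can be checked with a short calculation. This is a minimal sketch, assuming each base unit is encoded in the smallest whole number of bits; both function names are ours.

```python
def bits_for_base(base):
    """Smallest number of bits whose states can encode one unit of `base`."""
    if base < 2:
        raise ValueError("base must be at least 2")
    # (base - 1) is the largest digit value; its bit length is the width needed.
    return (base - 1).bit_length()


def unused_states(base):
    """Bit patterns left over after every value of one base unit is assigned."""
    return 2 ** bits_for_base(base) - base


print(bits_for_base(13), unused_states(13))  # 4 3
print(bits_for_base(10), unused_states(10))  # 4 6
```

A base 13 unit thus needs 4 bits, of whose 16 states 3 go unused; a decimal digit also needs 4 bits and leaves 6 states unused.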




Computers  have  been  built  with  a  variety  of  different  bases, 
the  main  ones  being  binary,  decimal,  and  character.  The  character 
has  shifted  between  a  6-bit  character  and  an  8-bit  character 
(byte).¹ The arguments for bases other than binary (which represents
the natural base of the computer) all hinge on the alphabets
used  externally  by  human  beings  and  the  desire  to  avoid  conver- 
sions into  a  different  representation  inside  the  computer.  With 
universal  acceptance  of  higher  languages,  such  as  FORTRAN  and 
ALGOL,  this  argument  has  also  lost  much  of  its  force.  In  fact, 
all  third-generation  machines  are  binary.  Nevertheless,  in  the  fifties 
there  was  much  controversy  over  which  base  to  use,  and  the 
machines  presented  in  this  book  exhibit  all  three  bases. 

There  is  little  difference  between  binary  and  decimal  com- 
puters in  their  ISP  organization.  However,  there  is  a  great  differ- 
ence between  these  two  and  character  machines.  The  latter  are 
designed  for  handling  text  and  are  constructed  to  deal  with  varia- 
ble-length strings  of  characters.  Correspondingly,  they  deempha- 
size  numerical  computation.  Both  these  decisions  affect  the  ISP 
considerably.  Thus,  in  the  computer  space  we  indicate  the  base 
dimension  along  with  the  word-length  dimension.  The  two  to- 
gether make  up  a  single  dimension. 

Word  length.  Let  us  now  examine  the  role  of  word  length.  The 
word  is  the  first  major  information  unit  above  the  base.  It  is  defined 
as  n  bits  for  a  binary  computer  or  n  digits  for  a  decimal  computer 
(character  machines  being  excluded  as  not  having  a  fixed  word 
length).  Sometimes  there  are  intermediate  units,  but  they  always 
play  a  minor  role  and  we  can  disregard  them  at  this  stage.  As  we 
noted  earlier,  the  main  determinant  of  word  length  has  been  the 
function of the total system: large word lengths for arithmetic
systems,  small  word  lengths  for  control  systems  (and  character 
strings  for  business).  Thus,  only  within  narrow  limits  is  the  word 
length  a  free  design  choice. 

However,  the  interesting  thing  about  word  length  is  not  so 
much  its  determinant  as  the  way  it  affects  other  aspects  of  the 
total  system  design.  This  starts  with  a  design  decision  that  the 
unit  of  information  transfer  between  components  will  be  a  word. 
As  soon  as  this  becomes  the  case,  then  registers  in  various  com- 
ponents must  hold  a  word,  since  that  is  what  arrives  or  is  to  be 
transmitted.  Thus  the  word  becomes  the  information  unit  of  the 
Mp,  and  most  of  the  registers  of  the  Pc  hold  one  word.  The  instruc- 
tion is  designed  to  fit  into  one  word,  since  that  is  the  number 
of  bits  that  is  obtained  "at  once"  and  hence  can  be  used  to  effect 
the  next  time  increment  of  processing. 

¹Seven bits have been proposed for communication purposes but have never
been  made  the  basis  of  a  machine,  as  far  as  we  know. 


Once  these  basic  features  are  set,  others  follow.  An  integer 
number  of  any  smaller  units,  such  as  the  character,  should  fit  into 
a  word,  since  otherwise  a  set  of  words  will  not  provide  a  homoge- 
neous sequence  of  subunits.  (That  is,  only  five  6-bit  characters  fit 
into 32 bits, so that a set of 32-bit words filled with 6-bit characters
has  a  number  of  2-bit  holes  in  it.  This  can  complicate  algorithms 
that  deal  with  long  character  strings.)  The  constraint  of  compati- 
bility is  not  so  strong  with  Ms,  since  speeds  are  slow  enough  to 
permit  conversion  algorithms  (either  hardware  or  software).  Still, 
the  system  is  simpler  (and  therefore  usually  will  work  better)  if 
incommensurabilities  of  information  units  do  not  exist.  Thus,  to 
pick  an  example,  the  number  of  parallel  tracks  on  magnetic  tapes 
tends  to  divide  evenly  into  the  word  length.  IBM  tapes  for  the 
700 series of 36-bit machines have six data tracks; for the System/360,
which has a 32-bit word, the tapes have eight data tracks.
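
The commensurability argument comes down to integer division. As a small sketch (the helper name `pack` is ours), the same calculation covers both the character example and the tape-track example:

```python
def pack(word_bits, unit_bits):
    """Return (units per word, leftover bits) when fixed-size units
    are packed into one word."""
    return divmod(word_bits, unit_bits)


print(pack(32, 6))  # (5, 2): five 6-bit characters and a 2-bit hole
print(pack(36, 6))  # (6, 0): six 6-bit characters, or six tape tracks, fit exactly
print(pack(32, 8))  # (4, 0): four 8-bit bytes, or eight tape tracks, fit exactly
```

A nonzero remainder is the "hole" that complicates algorithms working over long character strings.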

There  is  an  interesting  correlation  between  the  word  length 
of  a  computer  and  the  number  of  data-types  that  it  makes  availa- 
ble. As  we  saw  in  Chap.  2,  the  operations  in  a  computer  can  be 
classified  according  to  the  type  of  data  they  operate  upon.  Each 
data  type  tends  to  have  a  certain  set  of  operations  appropriate 
to it (for example, +, -, ×, and / for numbers) and the decision
to  include  a  data-type  carries  with  it  the  decision  to  include 
its  operations.  Thus  the  number  of  operations  tends  to  grow  with 
the  number  of  data-types.  The  total  amount  of  hardware  in  a 
computer grows as the word size (because data paths are word-parallel²)
and also as the number of operations. Thus machines
with  large  word  size  tend  to  be  large  machines  and  have  many 
data-types  and  many  operations.  ("Large"  as  an  adjective  for 
machines  invariably  means  big  and  expensive,  hence — given  eco- 
nomics— capable  of  doing  large  amounts  of  processing.) 

There  are  two  additional,  somewhat  independent,  features  that 
support  the  relationship  between  word  size,  number  of  data-types, 
and  size  of  computer.  First,  with  a  large  system  there  will  already 
be  available  many  of  the  pieces  necessary  to  add  additional  oper- 
ations. That  is,  the  marginal  cost  of  a  new  operation  goes  down 
as  the  system  grows.  Therefore,  given  a  large  system,  there  is  a 
tendency  to  add  more  operations.  The  number  of  operations  per 
data-type  is  not  easy  to  increase;  rather,  one  adds  new  data-types. 
Second,  with  small  word  lengths,  one  cannot  define  many  worth- 
while data-types  that  will  fit  into  a  word,  and  multiple-word  data- 
types are  left  to  the  programmer  to  define  with  software.  With 
large  word  lengths  there  are  many  different  worthwhile  data-types 
that  fit  into  the  word,  for  instance,  decompositions  of  the  word 
into  partial  words,  or  into  character  strings.  Each  of  these  requires 

²The issue of bit-serial versus bit-parallel is discussed subsequently.




additional  operations,  since  the  initial  data-types  involve  the  entire 
word  or  some  large  part  of  it  (i.e.,  the  word,  address,  and  integer 
operations). 

In  sum,  the  word  length  stands  as  an  indicator  of  many  aspects 
of  the  machine.  It  not  only  tells  something  about  the  basic  organi- 
zation of  many  components  but  indicates  how  big  the  computer 
is,  both  in  number  of  data-types  and  number  of  operations.  Figure 
2  shows  time  lines  of  well-known  computers  with  their  word 
length,  with  a  special  time  line  for  the  ones  in  this  book.  Five 
groups are suggested in the figure which classify these computers.¹
The classes overlap, and to separate a computer into one of two
classes requires more knowledge (e.g., the number of data-types).
For example, the 24-bit SDS 9300 and CDC 3200 appear in the
same class with the 36-bit IBM 7090 just because both machines
have  floating  point  hardware  and,  in  fact,  perform  comparably  for 
arithmetic  tasks. 
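
The footnote to this paragraph gives the class number as essentially the base-2 logarithm of the Mp word length, minus 2. The sketch below tabulates that quantity unrounded for some common word lengths, since the rounding rule is not legible in this copy; `class_value` is our own name, and the list of word lengths is chosen only for illustration.

```python
import math


def class_value(word_length_bits):
    """Unrounded log2(word length) - 2, following the footnote's formula.
    How the result is rounded to a class number is left open here."""
    return math.log2(word_length_bits) - 2


for bits in (12, 16, 24, 32, 36, 48, 60):
    print(bits, round(class_value(bits), 2))
```

Word lengths that are not powers of two fall between integer values, which is consistent with the remark that the classes overlap.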

The  one  design  choice  that  makes  word  length  have  few  of  the 
consequences  just  described  is  making  a  computer  bit-serial  rather 
than bit-parallel. In many machines information transfers are conducted
on a single bit stream (especially Pc-Mp transfers). Coincident
with this is the construction of operations on a bit-by-bit
basis.  This  works  well  for  arithmetic  and  logical  operations.  Time 
is  traded  for  hardware.  The  cost  of  the  system  becomes  independ- 
ent of  word  length,  but  the  processing  rates  go  down  correspond- 
ingly. This  design  decision  was  an  extremely  important  one  when 
logic  was  expensive  and  unreliable.  It  has  become  less  so  in  the 
current  era,  where  processors  and  transfer  paths  are  relatively  few 
in  number  while  both  the  cost  and  the  reliability  of  components 
have  improved.  However,  as  large  parallel  processors  are  con- 
sidered (~10^  P's),  bit-serial  processors  again  become  a  serious 
design alternative. (See the serial computers of Part 3, Sec. 2.)
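
The trade of time for hardware in a bit-serial design can be made concrete with a small simulation. This is a minimal sketch, not a description of any machine in the book: a single one-bit full adder is reused once per bit, so the logic does not grow with word length but the number of steps does. All names are ours.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum bit, carry out)."""
    total = a + b + carry_in
    return total & 1, total >> 1


def bit_serial_add(x, y, word_length):
    """Add two unsigned words one bit per step, least significant bit first,
    reusing a single full adder.

    Returns (sum modulo 2**word_length, number of steps taken).
    """
    carry = 0
    result = 0
    steps = 0
    for i in range(word_length):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
        steps += 1
    return result, steps


print(bit_serial_add(25, 17, 16))  # (42, 16)
```

A bit-parallel adder would instead need one full adder per bit position and would finish in a single step (ignoring carry propagation), which is the hardware cost that grows with word length.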

In  summary,  word  length  is  an  important  dimension,  and  we 
find  many  characteristics  either  proportional  to  or  inversely  pro- 
portional to  it.  To  be  sure,  these  relations  hold  only  for  current 
design  practice,  as  we  have  seen  with  the  bit-serial  designs.  The 
main-line  computers  in  Part  2  are  ordered  according  to  increasing 
word  length. 

Data-types.  We  have  presented  the  number  of  data-types  as  being 
correlated  with  word  length  and  also  with  computer  size  through 
the  effect  on  number  of  operations.  Although  far  from  perfect, 
there is a rough order in which specific data-types are included
in  a  computer.  We  have  listed  the  main  types  in  such  an  order 
in  the  data-type  dimension  of  the  computer  space.  (See  Chap.  2 

¹The class number is essentially [log2(Mp word length) - 2].


for  their  definitions.)  To  be  located  at  a  point  on  this  dimension 
(say  at  floating  point)  means  to  have  all  the  data  types  below  it 
on the dimension (i.e., word, address, integer, boolean). Occasionally
machines which violate this have arisen. Decimal machines
do not generally have boolean data-types, and there has
been some attempt at machines with only floating point, i.e.,
without a separate integer type (e.g., the CDC G20²).

The  reason  behind  this  cumulation  of  data-types  in  a  fixed  order 
is  that  certain  general  tasks  must  be  performed  by  any  computer. 
It  must  transmit  data  between  the  Pc  and  Mp,  and  this  trans- 
mission has  nothing  to  do  with  the  meaning  or  content  of  the  data; 
thus  there  is  always  the  "unit  of  transmission,"  which  is  the  word 
(except  on  character  machines).  Next,  all  computers  manipulate 
addresses  to  achieve  generality  (e.g.,  to  compile),  providing  for  a 
second data-type. Next come integers, since almost all algorithms
make  use  of  arithmetic  (this  could  conceivably  be  absent  in  some 
communications  computers),  and  on  up  to  floating  point  numbers, 
multiple precision, and vector and string operations. At each stage
the  uses  are  more  specialized  so  that  lower  ones  cannot  be  elimi- 
nated, except  for  a  few  cases  such  as  handling  addresses  as  regular 
integers. 
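
The cumulative reading of the data-type dimension can be sketched as an ordered list, where a machine located at one point is taken to have every type below it. The order of the first five entries follows the text; the relative order of the entries after floating point is an assumption of this sketch, as are the names.

```python
# Order as far as it can be read from the text; the positions after
# "floating point" are an assumption of this sketch.
DATA_TYPE_ORDER = [
    "word", "address", "integer", "boolean",
    "floating point", "multiple precision", "vector", "string",
]


def implied_data_types(point):
    """Data-types a machine located at `point` is expected to have:
    the point itself plus everything below it on the dimension."""
    index = DATA_TYPE_ORDER.index(point)
    return DATA_TYPE_ORDER[: index + 1]


print(implied_data_types("floating point"))
# ['word', 'address', 'integer', 'boolean', 'floating point']
```

The exceptions noted in the text (decimal machines without boolean types, or floating-point-only machines) are exactly the machines for which this simple prefix rule fails.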

Addresses  per  instruction  and  processor  state 

The  number  of  addresses  in  an  instruction  has  been  a  traditional 
way of describing processors (i.e., their ISP's) and hence the computer
systems containing these processors.³ We use it in Parts 2
and  3  to  separate  the  different  processors. 

Originally the dimension was simple: one-, two-, three-, and
four-address machines were constructed. It has become somewhat
more complex. A "one plus one" machine has one address for data
and one for determining the next instruction, and is to be
distinguished from a two-address machine, which uses both addresses
for data. Index registers and so-called general registers provide
instruction schemes which lie somewhere between one- and
two-address organizations. When processors admit several instruction
formats or variable-length instructions, matters become even more
complicated.

A correlated dimension in the computer space is the amount
of processor state, that is, the number of bits that exist in the
processor, as described in the ISP. This is the amount of
information that can be held at the end of one instruction to provide
the processing context for the next instruction. It consists of a
number of status and mode bits (in modern machines packaged into
registers, but in earlier machines simply scattered around in the
processor), the next instruction address, the accumulator and other
arithmetic registers, the index registers, and other general registers
making up a "scratch-pad" memory. It is a simpler descriptor of
the ISP than addresses per instruction, since it is independent of
the number and variety of instruction formats. It is easy to define
processor state generally for any ISP, but difficult to define
addresses per instruction.

²Originally the Bendix G-20.

³Although used mostly to describe Pc's, the description applies to any
processor.

Part 1 | The structure of computers

The  processor  state  is  not  the  total  number  of  bits  in  the  proc- 
essor, since  there  may  be  registers  in  the  physical  system  that  are 
used  within  the  interpretation  of  one  instruction  but  which  carry 
no  information  between  instructions.  Address  registers  for  obtain- 
ing operands  from  Mp  are  the  most  common  such  "underground" 
or  "temporary"  registers,  but  there  can  be  others.  We  implied  this 
distinction  by  defining  processor  state  in  terms  of  the  ISP  rather 
than  the  physical  processor. 

The  correlation  between  the  processor  state  and  the  number 
of  addresses  per  instruction  is  not  simple,  since  it  rests  on  two 
separate  issues.  For  the  first,  note  that  larger  programs  perform 
transformations  on  the  state  of  Mp  (or  even  Ms  or  Tio's)  and  are 
not  concerned  with  the  state  of  the  processor.  Processor  state 
enters  only  because,  in  decomposing  the  total  algorithm  into  a 
series  of  small  steps,  it  is  not  possible  (or  efficient)  to  make  each 
step a transformation from Mp to Mp. Basically, this happens
because  the  instruction  does  not  hold  enough  information  to  spec- 
ify the  Mp-to-Mp  transformations.  For  example,  if  one  wants  to 
add  two  numbers,  two  operands  are  required,  and  an  instruction 
must  contain  at  least  two  addresses;  if  it  does  not,  then  an  inter- 
mediate state  (i.e.,  processor  state)  must  be  created  to  hold  the 
information  while  the  additional  instructions  are  fetched.  Thus, 
one-address  organizations  require  the  most  processor  state,  with 
less  for  two-  and  three-address  organizations.  This  consideration 
stops  at  three  (two  operands  and  a  result)  because  only  a  few 
elementary  operations  are  more  than  binary.  The  processor  state 
cannot  be  eliminated  entirely,  however,  since  there  must  be  at 
least  an  instruction  address  (a  program  register)  to  maintain  con- 
tinuity of  the  program. 

The second source of correlation between processor state and
addresses per instruction comes from differential access time to
processor registers and to Mp. As long as there is an appreciable
differential, a substantial gain in processing power can be obtained
from increasing processor state. This derives, again, from the
structure of algorithms which generate intermediate results that are
used almost immediately afterward and then are of no further
interest. Rapid temporary storage and retrieval are beneficial
under these conditions. Thus, working against higher address
organization is the extra time to store in Mp results that need only
temporary storage. Thus, also, index registers and general registers
almost always imply increased processor state, although they need
not do so logically (that is, the registers could exist in Mp and
still have their effect on the instruction format).

With  interrupts  and  multiprogramming  the  processor  state 
gains  additional  significance,  since  it  is  the  amount  of  information 
that  has  to  be  saved  and  restored  when  switching  programs. 
For example, in the Honeywell H-800, an early three-address
computer,  the  processor  state  per  program  consisted  only  of  the 
program  counter  and  index  registers,  and  when  io-halts  occurred 
during  processing,  the  Pc  was  switched  immediately  to  another 
program.  Eight  programs  could  run  concurrently  (by  having  a  total 
processor  state  of  64  program  registers).  In  present  computers  with 
general-register  state,  often  25  ~  100  words  must  be  stored,  which 
implies  an  appreciable  time  for  switching  contexts. 
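
The cost of switching programs can be sketched as simple arithmetic. The snippet below is only an illustration: the division of the H-800's 64 program registers among 8 programs follows from the figures above, while the Mp cycle time and the one-cycle-per-word save and restore are assumptions made for the example.

```python
# Sketch: processor state as the quantity saved and restored on a program switch.
# MP_CYCLE_US is a hypothetical Mp cycle time, not a figure from the text.
MP_CYCLE_US = 2.0


def registers_per_program(total_registers: int, programs: int) -> int:
    """Processor state per program when the total state is divided evenly."""
    return total_registers // programs


def switch_time_us(state_words: int, cycle_us: float = MP_CYCLE_US) -> float:
    """Time to store one program's state and load another's,
    assuming one Mp cycle per word in each direction."""
    return 2 * state_words * cycle_us


h800_state = registers_per_program(total_registers=64, programs=8)
small_context = switch_time_us(25)    # low end of the 25 ~ 100 word range
large_context = switch_time_us(100)   # high end of the range
```

Under these assumed timings the switching time grows linearly with the number of state words, which is the point made in the paragraph above.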

We  can  now  consider  briefly  the  different  organizations  accord- 
ing to  addresses  per  instruction.  To  show  the  common  similarities, 
we  give  in  Fig.  4  a  state  diagram  that  can  be  used  for  all  processors. 
In  common  is  the  basic  idea  of  the  stored  program:  Fetch  an 
instruction, determine what the instruction is to do, then execute
it  (the  fetch-execute  cycle).  Other  than  this,  only  a  part  of  the 
state  diagram  will  be  applicable  to  a  given  processor  type. 
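
The fetch-execute cycle can be illustrated with a minimal interpreter. This is a sketch for a hypothetical one-address machine; the operation names LOAD, ADD, STORE, and HALT and the representation of Mp as a Python list are assumptions made for the example, not part of any machine described here.

```python
# Sketch of the fetch-execute cycle for a hypothetical one-address machine.
# The operation names and the instruction encoding are invented for illustration.
def run(mp: list, start: int = 0) -> list:
    pc = start   # instruction address (part of the processor state)
    acc = 0      # accumulator (part of the processor state)
    while True:
        op, v = mp[pc]          # fetch the instruction q from Mp
        pc += 1                 # implicit next-instruction address
        if op == "LOAD":        # decode, then execute
            acc = mp[v]
        elif op == "ADD":
            acc = acc + mp[v]
        elif op == "STORE":
            mp[v] = acc
        elif op == "HALT":
            return mp
        else:
            raise ValueError(f"unknown operation {op!r}")


# Program in words 0-3, data in words 4-6: Mp[6] <- Mp[4] + Mp[5].
memory = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", 0), 2, 3, 0]
result = run(memory)
```

Only `pc` and `acc` survive from one instruction to the next, so together they are the whole processor state of this toy machine.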

As  shown  in  the  computer  space,  the  addresses-per-instruction 
dimension  starts  with  zero  addresses,  then  one  address,  then  one 
plus  indexing,  one  plus  general  registers,  and  on  up  to  two,  three, 
and  variable  addresses.  However,  from  an  expository  viewpoint 
one  should  follow  a  different  course,  starting  with  single-address 
machines,  then  indexing,  then  two-  and  three-address  machines, 
then general registers, and finally the zero-address and
variable-address organizations. This not only puts the more common
organizations first but makes it easy to relate the organizations
to each other.

P(1 address) and P(1 + index address). These Pc's constitute most
first-, second-, and simple third-generation computers. The earliest
outline of the structure was the IAS computer (Chap. 4), which
has come to be known as the von Neumann computer. Although
fundamentally like the IAS computer, EDSAC's adaptation appears
to be the closest prototype to this class. Although EDSAC
is not described, it influenced M.I.T.'s Whirlwind I significantly
(Chap. 6).

A  significant  change  to  the  IAS  machine  was  the  addition  of 
the  index  register  (called  B-tubes)  in  the  Manchester  University 
machine  in  the  early  1950s.  The  evolution  can  be  seen  by  compar- 
ing the  first  and  third  generations  using  Whirlwind  (Chap.  6)  and 


Chapter  3  |  The  computer  space  59 


State name     Time in a state   Meaning

soq/oq         toq               Operation to determine the instruction q
saq/aq         taq               Access (to Mp) for the instruction q
so.o/o.o       to.o              Operation to decode the operation of q
sov.r/ov.r     tov.r             Operation to determine the variable address v
sav.r/av.r     tav.r             Access (to Mp) to read the variable v
so/o           to                Operation specified in q
sov.w/ov.w     tov.w             Operation to determine the variable address v
sav.w/av.w     tav.w             Access (to Mp) to write variable v


Fig.  4.  ISP  interpretation  state  diagram. 


the IBM 1800 (Chap. 33) or looking at the IBM 701-7094 evolution
in Part 6, Sec. 1. Index registers are motivated by the frequent
occurrence, in 1 address systems, of circuitous address
calculations that involve first computing the address (e.g., the index
of an array in Mp) and then planting it just ahead in the
instruction stream in order to make use of it as an address. Providing
a set of index registers introduces a second address into the
instruction, even though of extremely limited function. Thus we
classify processors with indexing as having (1 + x) addresses
per instruction.¹ An alternative view of index registers suggests
that they double the number of data-types by allowing operations
on vector data elements rather than just scalars.

¹Indirect addressing, on the other hand, does not add to the addresses per
instruction; rather, it introduces a second operation per instruction.


For the 1 address processor, the processor state (Mps) typically
consists of the program counter (instruction location counter), an
Accumulator AC, a Multiplier-Quotient register MQ (the extension
of AC), and one or more Index registers XR.

With only one address in the instruction, the one arithmetic
register, A, must be used for temporary results. Thus an
effective-address integer (z) is computed as a function of the address
part (v part) of the instruction (q) and the index registers. This
process is typically


    z := v + X[j]

where X[j] is the jth index register, as specified in the instruction.
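
The effective-address calculation can be sketched in a few lines. The additive form z = v + X[j] is assumed here as the usual case, and the convention that `j = None` means "no indexing" is an invention of the example.

```python
# Sketch: effective-address calculation for a (1 + x) address instruction.
# The additive form z = v + X[j] is an assumption; j = None means "no indexing".
from typing import Optional, Sequence


def effective_address(v: int, x: Sequence[int], j: Optional[int] = None) -> int:
    """Return z from the address part v and the index registers X."""
    return v if j is None else v + x[j]


X = [0, 3, 10]                 # contents of three index registers
mp = list(range(100, 120))     # Mp[i] holds 100 + i, purely for illustration
z = effective_address(5, X, j=1)
operand = mp[z]                # operand of a direct load, A <- Mp[z]
```

Stepping X[1] from 3 to 4 makes the same instruction address the next element of a vector, with no change to the instruction itself.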

There  are  several  forms  for  the  transmission  operators  between 
A  and  Mp. 


    A ← z               load immediate
    A ← Mp[z]           load direct
    A ← Mp[Mp[z]]       load indirect
    Mp[z] ← A           store direct
    Mp[Mp[z]] ← A       store indirect


In  indirect  operations  a  convention  may  be  required  to  determine 
what  address  in  Mp[z]  is  to  be  used. 
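
The five transmission forms can be written out directly. In this sketch Mp is modeled as a Python list, and the whole word Mp[z] is taken as the indirect address; that is only one possible convention, assumed here for simplicity.

```python
# Sketch of the transmission forms between A and Mp.
# Assumption: for indirect forms the entire word Mp[z] is used as the address.
def load_immediate(z: int) -> int:
    return z                # A <- z


def load_direct(mp: list, z: int) -> int:
    return mp[z]            # A <- Mp[z]


def load_indirect(mp: list, z: int) -> int:
    return mp[mp[z]]        # A <- Mp[Mp[z]]


def store_direct(mp: list, z: int, a: int) -> None:
    mp[z] = a               # Mp[z] <- A


def store_indirect(mp: list, z: int, a: int) -> None:
    mp[mp[z]] = a           # Mp[Mp[z]] <- A


mp = [3, 7, 9, 11]
a_immediate = load_immediate(2)
a_direct = load_direct(mp, 0)
a_indirect = load_indirect(mp, 0)
store_indirect(mp, 0, 42)   # Mp[0] holds 3, so word 3 is written
```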

Similarly, the binary operations (+, −, ×, /, ∧, ∨, ⊕,
concatenation, etc.) are generally of the form²

    A ← A b Mp[z]

Rarely do we find the symmetrical operation form

    Mp[z] ← A b Mp[z]

For unary operations (¬, −, abs, sin, cos, etc.) the most
common forms are

    A ← u A
    A ← u Mp[z]

Rarely do we find

    Mp[z] ← u Mp[z]
    Mp[z] ← u A

In both the above cases, exclusion of the operations that place
results in Mp[z] stems from the added cost of including the
symmetrical function and the marginal utility of such a function,
which stems from the result of applying u not being available for
further processing.

The  transmission,  unary,  and  binary  operators  account  for  al- 
most all  operations  in  these  computers.  If  we  allow  A  to  stand 
for  any  part  of  the  Mps,  rather  than  just  the  accumulator,  then 
the  instructions  not  included  above  are  input/output  data  trans- 
mission, e.g., 

    Mp ← T       and       T ← Mp

and conditional execution

    (branch if zero AC) → ((AC = 0) → (P ← z))

Having  index  registers  requires  operations  to  process  them.  At 
a  minimum  they  must  be  loaded  and  stored  (usually  from  and  to 
Mp),  i.e., 

    Mp[z] ← X       store index register

    X ← Mp[z]       load index register

²Any of the addressing modes suggested above can be used for an operand:
that is, z immediate, Mp[z] direct, and Mp[Mp[z]] indirect.


But simple operations on an X are also desirable; for example,

    X ← X + 1

Here X is used to point to (access) the next element in a vector.
More complex operations can be carried out by placing X in the
A register, via the program steps:

    A ← X        load A with X
    A ← f(A)     manipulate A
    X ← A        load X with A

An operation to add k to X would then be

    A ← X; next
    A ← A + k; next
    X ← A

instead of

    Mp[z] ← X; next
    A ← Mp[z]; next
    A ← A + k; next
    Mp[z] ← A; next
    X ← Mp[z]

which assumes no transmission paths between X and A. Ideally
we would like to perform any operation directly on X as simply

    X ← X + k

From  this  begins  the  idea  that  X  should  look  like  the  main  arith- 
metic register,  A.  This  is,  no  doubt,  one  evolutionary  path  to 
general-register  processors. 
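
The two ways of adding k to X can be compared by counting steps. The sketch below simply mirrors the two program sequences; the returned instruction and Mp-reference counts are tallied by hand from those sequences, and the function names are invented for the example.

```python
# Sketch: adding k to an index register X with and without an X <-> A path.
# Each function returns (new X, instructions executed, Mp references for data).
def add_k_with_path(x: int, k: int):
    a = x            # A <- X
    a = a + k        # A <- A + k
    x = a            # X <- A
    return x, 3, 0


def add_k_via_mp(x: int, k: int, mp: list, z: int):
    mp[z] = x        # Mp[z] <- X
    a = mp[z]        # A <- Mp[z]
    a = a + k        # A <- A + k
    mp[z] = a        # Mp[z] <- A
    x = mp[z]        # X <- Mp[z]
    return x, 5, 4


direct = add_k_with_path(10, 4)
through_memory = add_k_via_mp(10, 4, [0] * 8, z=2)
```

Both sequences leave the same value in X; the second pays for the missing transmission path with two extra instructions and four Mp references.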

Part  2,  Sec.  1  is  devoted  entirely  to  1  address  computers  in 
the  first  three  generations.  They  were  the  "main  line"  of  computer 
development. 

P(2 address) and P(3 address). The computers in Part 3, Sec. 1
have instructions which contain multiple addresses. The
addresses (v) specify operands in Mp (Fig. 4). The Mps
decreases as the number of addresses per instruction increases,
since the operands need not be held temporarily between
instructions (i.e., each instruction performs a complete operation).
The instruction form for the 3 address computer is

    Mp[v3] ← Mp[v1] b Mp[v2]

where b is a binary operator, and v1, v2, and v3 are the addresses
specifying the operands. In the case of unary operations, u, v1 is
usually blank. In the case of a binary operation and a three-address
computer, the states are oq, aq, oo, ov.r, av.r, ov.r, av.r, o, ov.w,
av.w (Fig. 4). MIDAC (Chap. 14) and Strela (Chap. 15) are typical
three-address computers.
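
The three-address form and its interpreter states can be traced in code. This is a sketch only: the operator table, the function name, and the list model of Mp are assumptions of the example, while the state names are those of Fig. 4.

```python
# Sketch: one three-address instruction, Mp[v3] <- Mp[v1] b Mp[v2],
# recording the interpreter states of Fig. 4 as they are visited.
import operator

BINARY = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def execute_three_address(mp: list, b: str, v1: int, v2: int, v3: int) -> list:
    states = ["oq", "aq", "oo"]      # determine, fetch, and decode the instruction
    states += ["ov.r", "av.r"]       # address and read the first operand
    first = mp[v1]
    states += ["ov.r", "av.r"]       # address and read the second operand
    second = mp[v2]
    states.append("o")               # perform the operation
    result = BINARY[b](first, second)
    states += ["ov.w", "av.w"]       # address and write the result
    mp[v3] = result
    return states


mp = [7, 5, 0]
visited = execute_three_address(mp, "-", v1=0, v2=1, v3=2)
```

No value is carried over to the next instruction, which is why this organization needs so little processor state.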

A 2 address computer does not necessarily require more
processor state than a 3 address computer, since the operations can
correspond to

    Mp[v2] ← Mp[v2] b Mp[v1]

and

    Mp[v2] ← u Mp[v1]

However, some extra Mps is usual. The RW-400 (Chap. 38)
has an accumulator, and operations generally terminate with
results both in primary memory, Mp[v2], and in the accumulator.
The branch on accumulator instructions allows results to be
checked directly without referring to Mp. An especially nice
instruction in 2 address computers is the transmission instruction
(a special-case unary operation): Mp[v2] ← Mp[v1].

The IBM 1401 (Chap. 18) has two registers, A_address and
B_address, which hold v1 and v2, and can be loaded by the v1 and
v2 parts of the instruction. These registers point to (address)
operands and do not contain data. The remaining processor state is
the Instruction_address. The 1401 has instructions with no
address parts, and these instructions take as operand addresses
the values of A_address and B_address as of the previous
instruction. The 1401 instruction-interpreter state diagram is given
in Chap. 18 (Fig. 3). The state-diagram specialization (Fig. 4)
is roughly:

    oq, aq, oo, {ov.r1, av.r1, ov.r2, av.r2, o, ov.w2, av.w2} ...
                {ov.r1, av.r1, ov.r2, av.r2, o, ov.w2, av.w2}

where the sequence delimited by the { ... } is the operation on
a character; because the 1401 operates on variable-length strings,
it is repeated until the end of the string.

P(n + 1 address). Processors with n + 1 addresses deviate only
slightly from the n-address processors above. The final, or +1,
address explicitly specifies the address of the next instruction. As
such, it can be used with any instruction set. There are two reasons
why +1 addressing is used. First, freedom is provided in the
placement of each instruction within the program address space.
Second, the next instruction address can be calculated in parallel
with the execution of the current instruction.

For computers with cyclic memories (Part 3, Sec. 2), the +1
address allows both data and the next instruction to be specified
independently, providing the opportunity to arrange the program
and data in an optimum fashion. Since each instruction completion
time depends on the location of data, it is desirable that the next
instruction location be variable rather than the implicit next
address used for most processors. This is almost universal practice
in computers with Mp.cyclic (see LGP-30 in Chap. 16 for an
exception).
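
The benefit of a +1 address on a cyclic memory can be sketched with a simple timing model. The model is an assumption made for illustration: time is counted in word-times, an instruction is read in one word-time, execution takes a fixed number of word-times, and operand latency is ignored.

```python
# Sketch: rotational wait for the next instruction on a cyclic Mp.
# Simplified model (an assumption): after reading the word at `current`
# and executing for `exec_time` word-times, the head is over word
# (current + 1 + exec_time) mod words_per_rev.
def wait_word_times(current: int, next_addr: int, exec_time: int, words_per_rev: int) -> int:
    head = (current + 1 + exec_time) % words_per_rev
    return (next_addr - head) % words_per_rev


def best_next_address(current: int, exec_time: int, words_per_rev: int) -> int:
    """Location at which the next instruction arrives under the head with no wait."""
    return (current + 1 + exec_time) % words_per_rev


N = 64
sequential_wait = wait_word_times(current=10, next_addr=11, exec_time=5, words_per_rev=N)
optimum = best_next_address(current=10, exec_time=5, words_per_rev=N)
optimum_wait = wait_word_times(current=10, next_addr=optimum, exec_time=5, words_per_rev=N)
```

With the implicit next address the processor waits most of a revolution; with an explicit next address the programmer can place the instruction where the head will be.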

Microprogrammed processors may use the +1 address to locate
the next instruction, and there may be several such next addresses.
Microprogram subroutines tend to be short (intrinsic to
interpreting an instruction set), and there are many jump addresses. The
increased speed from not having to compute the next instruction
address is worth the added space cost. The IBM System/360 Model
30 (Chap. 32) shows the use of multiple (+1) addresses and if
classified according to our scheme would be at least a
P(microprogram; 3 + 1 address).

P(general register). The general register processor has a small array
of registers that can be used for multiple functions. These have
fast access compared with the Mp, so that it pays to do as much
processing as possible within them. Since the general register array
is small, it requires only a small address (3 to 8 bits). Thus the
instruction format contains fields for one (or more) general
registers. There must still exist addressing for Mp, though this never
exceeds a single address. Thus we classify general-register
machines as (1 + g) addresses per instruction.

The organization of a (1 + g) system can vary from something
very close to a (1 + x) organization, in which essentially every
instruction involves some Mp information, to an organization in
which the only Mp instructions are transfers between Mp and Mps
(the processor state holding the general registers), and there is a
two- or three-address instruction set involving only Mps (see the
CDC 6600 in Chap. 39). That is, from a data point of view the
Mps acts like a directly addressable Mp.

The processor state of a general register processor is invariably
held entirely within the general register array (rather than having
additional independent registers). This is due in part to an already
available  mechanism  (the  array)  and  in  part  to  the  need  for  pro- 
gram switching,  which  is  somewhat  simplified  by  having  all  the 
Mps  held  in  a  single  homogeneous  memory. 

The general registers typically perform a variety of functions:

1  Arithmetic registers (accumulator and the accumulator
   extension for the multiplier-quotient).

2  Index registers.

3  A second index register or base register; if the program
   addresses (v) are short, a base register is needed to address
   any area of Mp.

4  Subroutine linkage registers.

5  Program flag (sense) registers for boolean variables.

6  Stack pointer (P may have multiple simultaneously active
   stacks).

7  Address pointers to data arrays and lists.

8  Temporary data storage for intermediate results.

9  Temporary program storage for short program loops.

The  power  of  a  general  register  processor  is  obtained  because 
the  registers  can  serve  many  functions.  Thus  the  operations  on 
these  registers  can  be  extensive,  because  the  operations  need  not 
be  duplicated  in  other  parts  of  the  structure.  For  example,  special 
operations  for  index  registers  are  not  necessary  because  the  opera- 
tions for  integers  apply  universally  to  both  the  accumulator  and 
index  registers.  Of  course,  such  generality  requires  compromises. 
The stack computer is faster for problems which can utilize stacks,
whereas the general register Pc must utilize Mp for the stack(s)
and does not have the encoding efficiency of a pure stack processor
(see below). In addition, the assignment (and reassignment) of
general registers is most crucial, since they are a scarce resource
with many uses. A general register organization allows processors
with a high degree of parallelism to be constructed, since several
instruction subsequences can be executed concurrently.

The  actual  number  of  registers  is  rather  critical  and  depends 
not  only  on  the  algorithms  of  tasks  coded  but  also  on  the  technol- 
ogy. In  multiprogramming  and  interrupt  computers,  the  program 
switching  time  increases  with  the  number  of  registers.  Thus  the 
upper  bound  on  the  number  of  registers  is  both  cost  and  program 
switching  time. 

We would expect to find instructions which produce the
following effects.

Format                           Addresses/instruction

G[g] ← u G[g]                    1g
G[g1] ← u G[g2]                  2g
Mp[v] ← u Mp[v]                  1
Mp[v1] ← u Mp[v2]                2
G[g] ← u Mp[v]                   1 + g
Mp[v] ← u G[g]                   1 + g
G[g] ← G[g] b Mp[v]              1 + g
G[g1] ← G[g1] b G[g2]            2g
G[g1] ← G[g2] b G[g3]            3g
Mp[v] ← G[g] b Mp[v]             1 + g
Mp[v1] ← Mp[v2] b Mp[v3]         3

where

u are unary operators (¬ | − | abs( ) | −abs( ) | etc.)
b are binary operators (+ | − | / | × | ∧ | ∨ | ⊕ | etc.)
G is the general-register array
g, g1, g2, g3 are instruction parts specifying a general register, G
v, v1, v2, v3 are Mp addresses specified as a function of instruction and
  general registers (for example, v := (address + G[g]) or
  v := (address + G[g1] + G[g2]) in the IBM System/360).

General registers can be thought of as an outgrowth
(generalization) of the 1 + x processors, as we have already suggested.
Alternatively, they can be thought of as evolving from a 2 or 3
address structure. The UNIVAC 1103A, a 2 address processor
(Chap. 13), was no doubt a forerunner of the general register
UNIVAC 1107 and 1108. Pegasus (Chap. 9) is, we think, about the
earliest computer to use general registers (1956). In Part 2, Sec.
2 we discuss four general-register computers.

P.stack (0 addresses per instruction). From a PMS viewpoint the
P.stack is built around having a first-in-last-out memory (M.stack)
as part of the processor state. Conceptually, it is built around the
fact that computations can often be sequenced so that no explicit
names (i.e., addresses) are required for temporary results. All
operations are performed on the top of the stack. As each partial
result is computed, it is pushed down in the stack and appears
again to participate as an operand at exactly the appropriate point
in later calculation. Thus the stack operates as an implicit memory
for all intermediate products, and not only are transfers between
P and Mp avoided but space in the instruction for Mp addresses
is eliminated.

Instructions  in  such  a  system  consist  only  of  operations,  since 
all  their  operands  are  in  the  stack.  Thus  the  instruction  format 
is  that  of  zero  addresses  per  instruction.  There  must,  of  course, 
be  some  addressing  of  Mp  (just  as  in  a  general-register  organiza- 
tion). However,  the  addresses  for  Mp  themselves  sit  in  the  stack 
so that the instruction contains only the transfer (load or store)
operation, not the address. There still must exist some way of
getting fresh data in the stack, and all P.stacks have at least one
operation  that  loads  an  address  written  in  the  program  stream  onto 
the  top  of  the  stack. 

Why there should be this happy correspondence between
calculations to be performed and stack memories requires
a little explication. It rests fundamentally on the phrase
structuring of calculation in which each partial result is required
at one and only one point, so that each subcomputation can be
nested in the program (and hence its result nested in the stack)
in the same order as it will occur as operand to the one operation
that uses it.
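
The nesting of partial results can be seen by running a small stack program. The sketch below evaluates the expression used in Table 4, f = (a − b)/(c − d × e), on a hypothetical zero-address machine; the instruction names, the use of variable names as Mp addresses, and the sample values are all assumptions of the example.

```python
# Sketch: a zero-address (stack) evaluation of f = (a - b) / (c - d * e).
# Instruction names and the dict model of Mp are invented for illustration.
import operator

BINARY = {"sub": operator.sub, "mul": operator.mul, "div": operator.truediv}


def run_stack_program(program: list, mp: dict) -> dict:
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":                    # M.stack-top <- Mp[v]
            stack.append(mp[instruction[1]])
        elif op == "pop":                   # Mp[v] <- M.stack-top
            mp[instruction[1]] = stack.pop()
        else:                               # operands come from the stack only
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY[op](left, right))
    return mp


program = [
    ("push", "a"), ("push", "b"), ("sub",),
    ("push", "c"), ("push", "d"), ("push", "e"), ("mul",), ("sub",),
    ("div",), ("pop", "f"),
]
mp = run_stack_program(program, {"a": 9, "b": 3, "c": 8, "d": 2, "e": 3, "f": None})
```

Each partial result (a − b, d × e, c − d × e) is consumed exactly once, by the operation immediately above it in the expression tree, so none of them needs a name.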

There are several arguments against a P.stack. Multiple stacks
are often required. Part of the power of a P.stack is derived from
having higher-speed Mps for the stack. Yet only the top few (2 ~ 8)
registers of the stack can be in Mps. When M.stack overflows into
Mp, the speed of operations can become much worse than not
having a stack at all. A simpler implementation, for example,
P.general_registers, is as fast and perhaps more general. Another
difficulty with the stack is the inability to access other than the
top. If full addressing is provided, then the organization has
become almost general register. Yet another difficulty arises from
inhomogeneity of data-types, especially if several of them are
packed into a single word (the width of the stack). Thus, for
instance, in one stack machine (the Burroughs B 5000 in Chap. 22)
there is a completely separate nonstack ISP for string
manipulation.

A simple numerical computation is given in Table 4 as a
comparison of the P.stack, P.1 address, and P.general_registers. Here,
the P.stack is probably shown at its best as there are no
array-indices calculations or program-flow manipulations involving
testing, etc. The criteria we measure are the algorithm encoding
space and the problem running time.

The  kinds  of  instructions  interpreted  by  a  P.stack  are  typically: 


Operation          Interpreter state sequence    Example

Load               oq, aq, oo, ov.r, av.r        M.stack-top ← Mp[v]
Store              oq, aq, oo, ov.w, av.w        Mp[v] ← M.stack-top
Unary operation    oq, aq, oo, o(u)              M.stack-top ← u M.stack-top
Binary operation   oq, aq, oo, o(b)              M.stack-top ← M.stack-top b M.stack-top-1

Variable numbers of addresses per instruction. Although there are
a few operations that require the specification of three or more
addresses, these are of such low frequency that no machine has
ever been built (or seriously proposed, for that matter) that has
more than three data addresses and one next-instruction address.
(Some of the microprogrammed processors have more than one
next-instruction address, and they often do several operations in
parallel in one instruction.)

However, processors have been developed that can have
a variable number of operands. Most of these involve the use of
an instruction that is larger than a single Mp word. Thus, bringing
in the first word of an instruction, which contains the operation
code, determines how many additional operands are needed and
hence how many additional words to obtain from Mp. (In a
character-based system this may require several reads per operand;
in a word-based system this may be one or two operands per read.)
The gain in such a system is the higher average density of
operations per instruction, bought at the price of extra Mp accesses.
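
Fetching a variable-length instruction can be sketched as follows. The operation codes and their operand counts are hypothetical, and the model assumes a word-based Mp with one address per word.

```python
# Sketch: fetching an instruction whose length depends on its operation code.
# OPERAND_WORDS is a hypothetical table; one address is assumed per Mp word.
OPERAND_WORDS = {"NEG": 1, "MOVE": 2, "ADD3": 3, "HALT": 0}


def fetch_instruction(mp: list, pc: int):
    """Return (operation, addresses, next pc, Mp accesses used by the fetch)."""
    op = mp[pc]                              # first word holds the operation code
    n = OPERAND_WORDS[op]                    # it determines how many words follow
    addresses = mp[pc + 1 : pc + 1 + n]
    return op, addresses, pc + 1 + n, 1 + n


mp = ["ADD3", 10, 11, 12, "HALT"]
first = fetch_instruction(mp, 0)
second = fetch_instruction(mp, first[2])
```

The three-address instruction costs four Mp accesses to fetch and the address-free one costs a single access, which is the trade described above between operation density and extra Mp accesses.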

Most such variable-address processors have a mixture of one,
two, and three addresses per instruction, simply a mix of the types
already considered. The fundamental limit to such variability is
the processor state (plus the additional within-instruction
temporary state). This, of physical necessity, must be finite, and the
number of addresses must yield an amount of information that is
less than this total state. Otherwise the processor cannot hold onto
it to process it.¹ Thus the various processors which claim to operate
from a higher language (see the P.languages of Part 4, Sec. 4) must
in fact either translate into another simpler programming
language, as does the FORTRAN machine (Chap. 31), or become an
interpreter which processes a small amount of a language
statement before the rest.

PMS  structure 

The  idea  that  there  is  significant  higher  organization  to  computers 
is  relatively  new.  Texts  on  logical  design  of  computers  develop 
a  model  based  on  an  arithmetic  section,  input/output  devices,  a 
memory  for  holding  instructions  and  data,  and  a  single  control 
to  force  the  other  components  to  interact.  A  PMS  diagram  of  an 
early  model  is  given  in  Fig.  5  (X  represents  an  external  agent, 
usually  a  man).  The  Whirlwind  I  manual-model  figure  (page  10) 
used in Chap. 1 was rather highly developed because it had a
secondary memory and switching. Figure 6 is a PMS diagram
which reflects this more accurate model. Often computer designers
lump the devices at the periphery and call them all input/output;
these devices are both input/output terminals (T) and secondary
memories (Ms).

¹If it processes a large amount of information, but in pieces (i.e.,
sequentially in real time), it is not really executing a single instruction
based on all the addresses but has decomposed the total computation, just
as a single address organization has.


Fig.  5.  Early  model  of  a  stored  program  digital  computer  PMS  diagram. 




Table 4  Comparison of stack, general registers, and accumulator Pc for evaluating the expression: f = (a - b)/(c - d × e)

Pc.stack [stack contents]            Pc.general register          Pc.1address

Push a    [a]                        Load G[1], a                 Load d
Push b    [a, b]                     Subtract G[1], b             Multiply e
Subtract  [a - b]                    Load G[2], d                 Inverse subtract c*
Push c    [a - b, c]                 Multiply G[2], e             Store temporary
Push d    [a - b, c, d]              Inverse subtract G[2], c*    Load a
Push e    [a - b, c, d, e]           Divide G[1], G[2]            Subtract b
Multiply  [a - b, c, d × e]          Store G[1], f                Divide temporary
Subtract  [a - b, c - d × e]                                      Store f
Divide    [(a - b)/(c - d × e)]
Pop f     [ ] - stores stack at location f

                                     Pc.stack                Pc.general register                    Pc.1address

Program size:
  Address integer/ai                 6 ai                    6 ai + 8 ai(gr)                        8 ai
  Operation parts/o                  4 o                     7 o                                    8 o
Number of Mp references for data:
Program size for hypothetical        6 × (18 + 1) + 4 × 6    6 × (18 + 6 + 4) + 1 × (6 + 2 × 4)†    8 × (18 + 6)
example machines:                    138                     182                                    192
Program size in bits among           B8501‡: 168             IBM System/360: 208 (above*),          IBM 7090: 288 (above*),
specific Cs:                                                 224 (actual) + base register           360 (actual)
                                                             overhead (0 ~ 192)§

*Not an instruction in the specific example machines.
†Assume 16 general registers.
‡The Burroughs Corporation B8501 Pc.stack (discontinued).
§Not completely true, since System/360 has only a 12-bit address and uses base registers. Some overhead should be assumed. Worst case (but not unreasonable) is
6 × 32 or 192-bit overhead.
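
The three instruction sequences of Table 4 can be checked by executing them. The Python sketch below is an illustration only: the three small interpreters, the lowercase mnemonics, and the sample operand values are assumptions made for this example and do not describe any of the machines named in the table. Each routine runs its program against a copy of the same memory and returns the number of data references it made to Mp.

```python
import operator

# Hypothetical mnemonics; "isub" stands for the table's "Inverse subtract".
OPS = {"sub": operator.sub, "mul": operator.mul, "div": operator.truediv}

STACK = [("push", "a"), ("push", "b"), ("sub",), ("push", "c"), ("push", "d"),
         ("push", "e"), ("mul",), ("sub",), ("div",), ("pop", "f")]
GENERAL = [("load", 1, "a"), ("sub", 1, "b"), ("load", 2, "d"), ("mul", 2, "e"),
           ("isub", 2, "c"), ("div", 1, 2), ("store", 1, "f")]
ONE_ADDRESS = [("load", "d"), ("mul", "e"), ("isub", "c"), ("store", "temp"),
               ("load", "a"), ("sub", "b"), ("div", "temp"), ("store", "f")]


def run_stack(program, mp):
    """Stack Pc: only push and pop carry an address."""
    stack, refs = [], 0
    for op, *addr in program:
        if op == "push":
            stack.append(mp[addr[0]])
            refs += 1
        elif op == "pop":
            mp[addr[0]] = stack.pop()
            refs += 1
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[op](left, right))
    return refs


def run_general(program, mp):
    """General-register Pc: an int operand names a register G[i]."""
    g, refs = {}, 0
    for op, reg, src in program:
        if op == "store":
            mp[src] = g[reg]
            refs += 1
            continue
        if isinstance(src, int):
            value = g[src]
        else:
            value = mp[src]
            refs += 1
        if op == "load":
            g[reg] = value
        elif op == "isub":
            g[reg] = value - g[reg]
        else:
            g[reg] = OPS[op](g[reg], value)
    return refs


def run_one_address(program, mp):
    """Accumulator Pc: every instruction references Mp once."""
    acc, refs = 0, 0
    for op, addr in program:
        refs += 1
        if op == "load":
            acc = mp[addr]
        elif op == "store":
            mp[addr] = acc
        elif op == "isub":
            acc = mp[addr] - acc
        else:
            acc = OPS[op](acc, mp[addr])
    return refs


def compare(values):
    """Return {organization: (f, Mp data references)} for one set of operands."""
    result = {}
    for name, run, program in (("stack", run_stack, STACK),
                               ("general", run_general, GENERAL),
                               ("1address", run_one_address, ONE_ADDRESS)):
        mp = dict(values)
        refs = run(program, mp)
        result[name] = (mp["f"], refs)
    return result


print(compare({"a": 9, "b": 3, "c": 10, "d": 2, "e": 4}))
```

All three organizations compute the same f; what differs is how many instructions carry an address and how often Mp is referenced, which is what the program-size rows of the table measure.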


If  we  separate  each  component  according  to  its  function,  assign 
control  (K)  to  each  element,  and  finally  introduce  the  processor 
(P),  we  get  the  structure  of  Fig.  7.  Of  course,  a  large  part  of  P 
is  a  data  operator  (D).  The  processor  has  the  behavioral  properties 
attributed  to  the  structure  of  Fig.  5.  If  we  include  the  control 
within  each  component,  we  get  Fig.  8  from  Fig.  7. 


Fig. 6. Early computer model (with Ms and S) PMS diagram.


To  consider  larger  structures,  consisting  of  several  Mp's,  P's, 
Ms's,  and  T's,  one  might  think  to  expand  the  system  as  shown  in 
Fig.  9,  in  which  we  connect  everything  through  a  single  switch. 
If  the  central  S  has  sufficient  power  for  multiple  conversations, 
this  indeed  provides  maximum  generality.  However,  although 


Fig. 7. General computer model (with distributed control) PMS diagram.


Chapter  3  |  The  computer  space  65 


designs  have  been  proposed  for  such  a  system,  technology  and 
economics  have  so  far  prohibited  their  actual  realization.  Instead, 
there  has  developed  the  general  latticelike  structure  shown  in 
Fig.  10.  Each  switch  in  this  structure  connects  components  on  one 
side  with  components  on  the  opposite  side  (the  S  interconnecting 
the  P's  being  the  exception). 

The  lattice  structure  of  Fig.  10  is  hierarchical  in  the  sense  that 
the  Mp's  form  the  inner  core  and  one  travels  out  toward  the 
periphery  in  moving  from  left  to  right.  With  this  movement  there 
is  a  general  decrease  in  data  rate,  being  highest  through  the  Mp-P 
switch  and  lower  as  one  moves  to  the  right. 

The  model  has  five  switches  (S).  One  switch  connects  the  com- 
puter's peripheral  devices  with  the  external  environment  (human 
beings,  other  processes,  etc.).  Three  switches  appear  alike  in  the 
way  they  interconnect  Mp-P,  P-K,  and  K-(T|Ms),  respectively. 
However,  they  are  usually  quite  different.  We  would  expect  any 
P  to  connect  with  any  Mp.  We  probably  would  expect  to  have 
only  one  or  two  Pio's  connected  to  a  given  set  of  K's.  Most  cer- 
tainly one  or  two  K's  would  manage  a  given  set  of  Ms's  or  T's. 
Thus the structure nearest the periphery becomes more like a tree,
rather than a lattice (examples are provided in Figs. 11 and 12).
The last switch in Fig. 10, unlike the above four, provides
intercommunication among the processors. In any multiprocessor
structure (even 1Pc-nPio) there must be communication among the
processors. A switch of this type is organized as a nonhierarchy
and appears like a conventional telephone exchange, since any P
can  call  another.  On  the  other  hand,  the  amount  of  communica- 
tion (measured  in  bits)  is  rather  low. 

The  P's  and  (usually)  Mp's  have  their  controls  associated  with 
them,  and  we  have  not  bothered  to  show  such  K's  in  the  diagram. 
The  K's  that  are  shown  provide  control  for  the  T's  and  Ms's.  These 
are  separated  in  the  figure  because  they  are  separated  in  current 
computer systems and made into identifiable physical components.
Under current technology they are expensive devices, so that one
K  per  T  or  Ms  is  not  economical.  Therefore,  each  K  needs  to  be 


Fig. 9. General computer model (with multiple components) PMS diagram.


X (human | computer | network | mechanical process)
where
Pio := -Pio- | -Kio-
K   := null | -K- | -K-K-
T   := -T- | -K-T-
Ms  := -Ms- | -K-Ms-

Fig. 10. General computer model (multiprocessors) PMS diagram.


Fig. 8. General computer model (without K) PMS diagram.


Fig. 11. Tree-structured computer (1Pc) PMS diagram.




shared  among  a  set  of  T's  and  Ms's.  (That  is,  one  purchases  a  single 
magnetic-tape  controller  for,  say,  four  magnetic  tapes.)  The  shared 
K  also  explains  why  only  one  of  a  given  class  of  devices  (e.g., 
magnetic  tapes)  can  operate  at  a  time.  As  technology  changes 
(especially costs), these separate K's may disappear.

Nearly all the computers discussed in this book fit the lattice
model  of  Fig.  10.  However,  it  is  not  unlikely  that  structures  will 
be  or  have  been  built  that  do  not  conveniently  fit  it.  For  example, 
NOVA  (Chap.  26)  does  not  fit  the  model  nicely,  although  the  more 
complex  ILLIAC  IV  arithmetic-computer  portion  (Chap.  27)  does. 

The  values  along  the  PMS  structure  dimension  of  the  computer 
space  have  been  generated  from  the  general  model  and  laid  out 
in the order of their evolution. This evolution is strictly from less
complex to more. The seemingly more complex network structures,
such as the duplexed computers, are not necessarily as complex
as a single multiprocessor computer. Duplex computers have been
used for some time. The slow evolution to the parallel processor
structure is due primarily to limitations in technology. A structured
computer with a distributed control is more expensive than
a tightly integrated design with shared function. In addition,
multiprogramming — a  question  of  software — must  be  present  to 
allow  multiprocessing. 

The PMS structure plays only a minor role in obtaining
multiprocessing and parallel processing. The classical debate about
building large computers has always been resolved by building
a single large processor (e.g., the CDC 6600 and Stretch, Chaps.
39 and 34). Proponents of multiprocessors say that one can always
add  several  large  processors  to  a  structure  and  increase  the  per- 


Fig. 12. Tree-structured computer (1Pc-2Pio and lattice Mp-P switch)
PMS diagram.


formance  of  a  one-processor  structure.  In  Part  6,  Sec.  3,  when  we 
discuss  the  IBM  System/360,  we  advocate  multiprocessing. 

Today  there  is  no  parallel  processing  in  the  form  suggested 
in  Chap.  37.  We  include  a  discussion  of  parallel  processing  on  the 
bet  that  it  will  come  in  the  future.  Part  5  is  dedicated  to  moving 
along the PMS structure dimension.

The  simple  1  Pc  structure  shown  in  Fig.  11  is  a  tree.  Although 
there  are  no  values  on  the  information  rates,  the  nature  of  the 
fixed¹ and time-multiplexed switches indicates that perhaps the top
two  T's,  one  Ms,  and  one  of  the  bottom  T's  can  all  be  active  at 
a  given  time.  In  Fig.  12  a  1  Pc,  2  Pio  computer  is  given.  Here 
we  note  that  the  control  of  one  secondary  memory  is  by  a  Kio 
rather  than  the  Pio.  (The  Kio  cannot  fetch  its  next  instruction  from 
Mp  and  must  rely  on  Pc  for  control.)  Note  that  there  is  necessarily 
a  lattice  connection  between  the  2  Mp  and  the  Pc,  2  Pio,  and 
Kio. The special cases of P.display, multiprocessors, P(array | wired
algorithm),  and  parallel  processing  are  all  realized  from  the  general 
model  of  Fig.  10. 

Switching 

A principal issue of a computer design at the PMS level is
switching (as we indicated in the preface). Unfortunately, we do not
illuminate switching problems in this book except to provide
examples. The switching dimension of the computer space is
correlated with PMS structure, as we have just seen. To have a more
complex structure, more complex intercommunication (switching)
is required. Figure 13 shows the various logical switches, together
with some of the more common implementations. The switch
parameters are also given in the Appendix of this book. Each of
the switching issues will be discussed in turn as it applies to
various parts of the structural model (Fig. 10). The reader should
note that Fig. 13 has relatively primitive switches. More complex
switches can be formed by cascading (connecting) the primitives
together. (A noncomputer example is the manner in which
telephone exchanges are constructed and interconnected.)

Processor-memory switching. Only recently, with the advent of
multiple  processors,  has  memory-processor  switching  become  an 
important  problem.  But  the  Mp-P  switch  makes  multiprocessing 
possible,  and  it  is  a  determining  factor  in  both  performance  and 
reliability. 

The  structure  of  the  processor-memory  switch  for  computers 
which  have  multiple  memories  and  multiple  processors  is  a  lattice 
if  simultaneous  memory/processor  dialogues  are  allowed.  A  cross- 

¹A relative value for the attribute that denotes the time a switch is closed.
Fixed usually denotes a time duration such that more than 1 i-unit is
transmitted.




Fig. 13. Logical and physical switch structures PMS diagrams. The
diagrams themselves are not reproduced here; the panel legends follow.

Group I. Hierarchical switches for connecting a components
to b components for 2-way conversations. The logical
structures are first given, followed by common physical
realizations. For the physical realizations links are
required between pairs of components. Not all physical
realizations are given; it is assumed the roles of the a's
and b's can be interchanged.

.1    S(gate; 1 a; 1 b)
.1a   S(gate; switching at b)
.1b   S(gate; switching at a)
.1c   S(gate; switching at a, b)
.2    S(duplex; 1 a; n b; concurrency: 1; n S.gate)
.2a   S(duplex; radial; switching at b)
.2b   S(duplex; radial; switching at a)
.2c   S(duplex; bus/chain; commonly used for K-T interconnection)
.3    S(dual-duplex; 2 a; n b; 2n S.gate)
.3a   S(dual-duplex; radial; switching at b; duplex version of .2a)
.3b   S(dual-duplex; radial; switching at a; duplex version of .2b)
.3c   S(dual-duplex; bus/chain; duplex version of .2c)
.4    S(time-multiplex; cross-point; m a; n b; concurrency: 1;
      m + n S.gate; cascade of 2 duplex)
.4a   S(time-multiplex; cross-point; radial; central; concurrency: 1)
.4b   S(time-multiplex; cross-point; bus/chain)
.5    S(cross-point; m a; n b; concurrency: min(m, n); m × n S.gate)
.5a   S(cross-point; radial; links to a or b may be null)
.5b   S(cross-point; bus/chain; used for Mp-P interconnection)
.6    S(dual-duplex; cross-point; m a; n b; concurrency: min(m, n);
      2 × m × n S.gate)
.6a   S(dual-duplex; cross-point; radial)
.7    S(k-trunk; hierarchical; m a; n b; (m + n) × k S.gate)
.7a   S(k-trunk; central; hierarchical)

Group II. Non-hierarchical switching for interconnecting a
components for 2-way conversations.

.8    S(duplex; non-hierarchical; concurrency: 1)
.8a   S(duplex; non-hierarchical; central)
.8b   S(duplex; non-hierarchical; bus/chain); one link is redundant,
      used to keep interconnection time constant
.9    S(cross-point; non-hierarchical; m a; concurrency: m/2;
      m × (m - 1)/2 S.gate)
.9a   S(cross-point; non-hierarchical; central)
.9b   S(cross-point; non-hierarchical; radial; m × (m - 1)/2 links;
      star: all nodes have links to all other nodes)
.10   S(k-trunk; non-hierarchical; m a; concurrency: min(m/2, k);
      k × m S.gate)
.10a  S(k-trunk; central; non-hierarchical)

point switch provides redundancy and is used to form the lattice
structure. To vary from the full-duplex duplex switch (for
m-memories and one processor, or p-processors and one memory)
requires more components to be devoted to the switching, to
buffering, and to arbitration control. Hence duplex switches are
used on most multiprocessor computers. The processor-memory
switching possibilities can be seen nicely in Fig. 13. The
important switch parameters are the number of memories, the
number of processors, and the number of simultaneous
processor-memory dialogues. In current designs P always originates the
dialogue, which is generally taken to mean the reading or
writing of a given word in Mp. The range of complexity is roughly

S(null; 1M; 1P; concurrency: 1) |
S(simplex¹ | half-duplex² | full-duplex³; (mM; 1P) | (1M; pP);
  concurrency: 1) |
S(time-multiplex cross-point; mM; pP; concurrency: 1) |
S(cross-point; mM; pP; concurrency: min(m,p))
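
The cost side of this range can be made concrete with a small calculation. The Python sketch below is an illustration under stated assumptions: the `Switch` record and the three constructor functions are hypothetical names, and the gate counts (n for a duplex switch, m + p for a time-multiplexed cross-point, m × p for a full cross-point) are taken from the S.gate figures in the legends of Fig. 13 as far as they can be read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Switch:
    kind: str
    memories: int
    processors: int
    concurrency: int  # simultaneous memory-processor dialogues
    gates: int        # number of S.gate elements (assumed from Fig. 13)


def duplex(m, p):
    """(mM; 1P) or (1M; pP): one shared side, one dialogue at a time."""
    if min(m, p) != 1:
        raise ValueError("a duplex switch needs a single M or a single P")
    return Switch("duplex", m, p, 1, max(m, p))


def time_multiplex_cross_point(m, p):
    """Cascade of two duplex switches: any M to any P, one dialogue at a time."""
    return Switch("time-multiplex cross-point", m, p, 1, m + p)


def cross_point(m, p):
    """Full lattice: up to min(m, p) dialogues at once, at m * p gates."""
    return Switch("cross-point", m, p, min(m, p), m * p)


for s in (duplex(8, 1), time_multiplex_cross_point(8, 4), cross_point(8, 4)):
    print(s)
```

Under these assumptions the step from a time-multiplexed to a full cross-point switch raises concurrency from 1 to min(m, p), while the gate count grows from a sum to a product.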

An  S.duplex  can  be  used  to  increase  the  number  of  processors 
which  can  be  connected  to  the  memory  system  while  not  having 
to  provide  additional  switch  points  on  each  memory.  For  example, 
in the CDC 3600 [Casale, 1962] a basic S(8M; 4P; concurrency: 4)
is expanded by placing another S(1M; 6P; concurrency: 1)
in series to give a possible overall S(8M; 24P; concurrency: 4).
This  scheme  was  used  to  provide  multiple  processor  accesses  to  the 
memories. 
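
The arithmetic of this series connection can be written out. In the sketch below the tuple type `S` and the function `cascade` are hypothetical names introduced for illustration, and the rule for overall concurrency (no more than the core switch allows, and no more than one dialogue per fan-out switch) is an assumption consistent with the figures quoted above.

```python
from typing import NamedTuple


class S(NamedTuple):
    memories: int
    processors: int
    concurrency: int


def cascade(core: S, fanout: S) -> S:
    """Place one `fanout` switch on every processor port of `core`."""
    if fanout.memories != 1:
        raise ValueError("fanout must present a single port toward the core")
    return S(
        memories=core.memories,
        processors=core.processors * fanout.processors,
        concurrency=min(core.concurrency,
                        core.processors * fanout.concurrency),
    )


print(cascade(S(8, 4, 4), S(1, 6, 1)))
```

The number of attachable processors multiplies, but the number of simultaneous dialogues stays at that of the core switch.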

Processor-control switching. The first switching problem developed
with the need to communicate with several input/output devices.
This  switching  is  hierarchical  in  nature;  one  (or  two)  processors 

¹A switch which allows communication in one direction between two
ports.

²A switch which allows communication in either direction but only one
direction at a time.

³A switch which allows concurrent communication between two ports.




maintain control of many K's by giving a K a single instruction
task.  At  the  completion  of  the  task  the  K  signals  the  processor 
that  the  task  has  been  completed. 

The  switch  provides  a  link  between  processor  and  controls  for 
the  secondary  memory  or  the  terminals  and  is  parameterized  by 
the  number  of  processors,  the  number  of  controls,  the  number  of 
simultaneous  conversations,  and  who  originates  the  dialogue.  In 
these  switches  the  control  of  information  transmission  is  always 
by  the  processor.  The  evolution  has  been  approximately  as  follows: 

1  S(null; 1P; 1K; concurrency: 1; initiator: P)
   P and K are connected during data transfers.

2  S(simplex | half-duplex | full-duplex/duplex; 1P; 1K;
   concurrency: 1; initiator: P, K)
   Each K operates independently because it can return or
   request communication with P when control task is
   completed.

3  S(dual-duplex; 2P; 1K; concurrency: 2; initiator: P, K)
   Duplex paths from dual P's to each K for reliability.

4  S(cross-point; pP; kK; concurrency: min(p,k); initiator: P, K)
   General case of multiple P's and K's with communication
   among the components.

The  early  machines  used  the  first  structure,  and  concurrent 
operation  of  controls  was  possible  only  by  starting  several  controls 
and  by  very  carefully  programming  the  timing  for  the  data  trans- 
fers. Two  conditions  occurred  to  cause  this:  The  buffering  for  a 
T  or  an  Ms  was  associated  with  the  processor,  and  the  control 
could not signal the processor. Although rather trivial to
implement, the idea (item 2 above) of allowing a K to signal the
processor did not occur until after the idea of arithmetic processor
traps was incorporated into processors. The interrupt was used
as  the  method  by  which  a  K  communicated  its  desire  to  converse 
with  a  P.  The  early  IBM  709  provided  a  separate,  independent 
processor  for  handling  the  communication  with  input/output 
equipment.  Simultaneous  processor-to-input/output  or  secondary- 
memory  dialogues  could  take  place  (provided  the  devices  were 
connected  to  the  right  processor).  In  most  of  the  early  computers, 
part  of  the  control  function  (data  buffering)  was  associated  with 
the  Pc,  and,  as  such,  only  one  device  could  operate  at  a  time.  This 
stemmed  from  the  comparatively  high  cost  of  registers,  so  that 
links  were  established  for  a  fixed  period  of  time  during  a  com- 
plete block  transfer  of  data. 
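
The effect of letting a K signal the processor can be shown with a toy simulation. Everything in the sketch below is hypothetical: the class names, the `tick` time step, and the task durations are assumptions chosen for illustration, not a model of the IBM 709 or of any other machine discussed here. Each control is given one task, runs independently, and interrupts the processor when it finishes, so the processor never has to time the transfers itself.

```python
class Processor:
    """P: does its own work and records interrupts raised by controls."""

    def __init__(self):
        self.interrupts = []  # names of the K's that signaled, in order
        self.work_done = 0    # time units spent on P's own computation

    def interrupt(self, control_name):
        self.interrupts.append(control_name)

    def compute(self):
        self.work_done += 1


class Control:
    """K: given a single task, runs on its own, then signals P."""

    def __init__(self, name, processor):
        self.name = name
        self.processor = processor
        self.remaining = 0

    def start(self, duration):
        self.remaining = duration

    def tick(self):
        if self.remaining > 0:
            self.remaining -= 1
            if self.remaining == 0:
                self.processor.interrupt(self.name)


def simulate(durations, steps):
    """Start every K, then let P compute while the K's run concurrently."""
    p = Processor()
    controls = [Control(name, p) for name in durations]
    for k in controls:
        k.start(durations[k.name])
    for _ in range(steps):
        p.compute()
        for k in controls:
            k.tick()
    return p


p = simulate({"K.tape": 5, "K.printer": 2}, steps=6)
print(p.interrupts, p.work_done)
```

In this sketch the processor keeps computing during every time step, and the order of the interrupts reflects only when each control finished.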

In  some  of  the  military  computers  a  duplicate  set  of  K's  is 
provided  for  reliability.  The  more  elaborate  switching  structures 
(types  3  or  4  above)  are  rarely  used  between  Pio's  and  K's;  thus 


to work on a peripheral requires the use of the rest of the
computer. The S.dual-duplex is becoming more common; it provides
a  method  of  off-line  operation  for  maintaining  better  component 
utilization  and  a  more  reliable  structure. 

Control-terminal and control-secondary-memory switching. The
switches which link a control with a particular terminal or
secondary memory are generally fairly straightforward. Normally, a fixed
duplex switch is used. However, a dual-duplex switch is used if
multiple access paths to the component are required. The switch
links a secondary memory to a control during the transmission
of  relatively  long  information  units  (e.g.,  records).  A  typical  ex- 
ample of  such  a  switch  is  the  bus  structure  used  when  magnetic 
tape  units  connect  to  a  common  control.  Only  one  of  the  units 
operates  at  a  time  (although  all  can  be  rewinding  simultaneously). 
The  switches  are  far  less  interesting  than  those  above.  Because 
they  are  nearer  the  periphery,  failure  in  them  does  not  imply  a 
failure  in  the  complete  system. 

Processor  function 

The  emergence  of  complex  PMS  structures  is  coincident  with  the 
development  of  functionally  specialized  processors.  In  the  simple 
computers  of  Figs.  5  to  9  there  is  place  only  for  Pc.  In  the  general 
lattice  there  can  be  a  Pc  specialized  to  perform  no  input/output 
operations;  one  or  more  Pio's  specialized  to  communicate  with 
the  T's  and  Ms's  and  even  to  organize  information  in  Mp  for 
transshipment; additional Pio's specialized to handle graphic
displays (hence P.display); and even P's specialized to work on
specific data-types (for example, P.array) or specific algorithms (e.g.,
the fast Fourier transform). In addition, any of these processors
may be realized by microprogramming, which is to say, by having
its ISP interpreted by a specialized P.microprogram.

Although the existence of various functionally specialized
processors is coupled most closely with the PMS structure
dimension, the processors themselves are defined primarily by the
data-types they can process. In this they agree entirely with the
computer-system-function dimension. Possibly the processor-function
dimension should be considered simply an extension of the
computer-system-function dimension. On the other hand, the inclusion
of microprogrammed processors really extends the PMS structure
dimension to where a P can be seen as a cascade of two P's.

The  processor-function  dimension  in  the  computer  space  is  laid 
out  in  an  evolutionary  way,  so  that  its  correspondence  with  PMS 
structure is clear. P.microprogram is put at the beginning of the
dimension  ahead  of  Pc,  not  because  it  occurs  earlier  in  evolu- 
tionary development,  but  because  it  extends  the  PMS  dimension 




down into the processor. Any of the P's along the dimension can
be attained by a P.microprogram.

As  an  actual  dimension  characterizing  a  total  computer  it  must 
be  viewed  cumulatively  (similarly  to  the  data-type  dimension). 
Thus, if a computer has a Pio, it also has a Pc, and if it has a P.array
it also has the prior ones. There are numerous exceptions to this,
such as small Pc's with P.displays (hence with no Pio's). This
evolutionary ordering does not correspond to complexity or
number of data-types in the P. Pc and P.array are the most complex;
Pio and P.vector-move are least.

We will make a few brief comments on each functional type,
taking  them  in  the  order  of  the  dimension. 

Microprogram processor (P.microprogram). The term
microprogramming was introduced initially in "The Best Way to Design an
Automatic Calculating Machine" (Wilkes, 1951a). We use
"microprogrammed" to mean that an ISP is defined by an interpreter
program residing in an internal Mp, processed by an internal
processor (the P.microprogram). Thus the structure is really an
external processor (ISP) being defined by the computer formed as

P := Mp(internal; read-only)-P.microprogram

The  operations  that  microprogram  processors  perform  are 
primitive  in  comparison  with  other  processors.  The  task  of  the 
microprocessor  is  to  interpret  the  instructions  of  the  ISP  it  is 
realizing.  This  involves  mostly  data  transfers  among  the  registers 
of  the  processor  state  (Mps)  plus  simple  boolean  tests.  Although 
it must handle all the data-types of the larger ISP, it does so only
as bit fields to be extracted and transferred from one register to
another. The complex data operations (e.g., multiplication) are
carried out by other units (D's). In fact, if a complex instruction
set were to be used for the P.microprogram, the external processor
might as well be implemented directly in hardware. In very
minimal P's, for example, C(PDP-8) in Chap. 5, the ISP is
essentially already at the level of a microprogram ISP, as shown by the
inclusion of instructions that can be microcoded.
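
The division of labor described above can be sketched in a few lines. The example below is a toy and an assumption throughout: the four-instruction external ISP (LOAD, ADD, STORE, HALT), the register names, and the register-transfer strings are invented for illustration and are not taken from any machine in this book. The read-only table plays the part of Mp(internal; read-only), `micro_step` plays the part of the P.microprogram, and the one addition stands in for a data operator D.

```python
from types import MappingProxyType

# Internal read-only Mp: each external instruction is a short sequence of
# register transfers (hypothetical micro-operations).
MICROCODE = MappingProxyType({
    "FETCH": ("MAR<-PC", "MBR<-Mp[MAR]", "IR<-MBR", "PC<-PC+1"),
    "LOAD": ("MAR<-IR.addr", "MBR<-Mp[MAR]", "AC<-MBR"),
    "ADD": ("MAR<-IR.addr", "MBR<-Mp[MAR]", "AC<-D.add(AC,MBR)"),
    "STORE": ("MAR<-IR.addr", "MBR<-AC", "Mp[MAR]<-MBR"),
})


def micro_step(op, s, mp):
    """Carry out one register transfer on processor state s and memory mp."""
    if op == "MAR<-PC":
        s["MAR"] = s["PC"]
    elif op == "MAR<-IR.addr":
        s["MAR"] = s["IR"][1]
    elif op == "MBR<-Mp[MAR]":
        s["MBR"] = mp[s["MAR"]]
    elif op == "MBR<-AC":
        s["MBR"] = s["AC"]
    elif op == "IR<-MBR":
        s["IR"] = s["MBR"]
    elif op == "PC<-PC+1":
        s["PC"] += 1
    elif op == "AC<-MBR":
        s["AC"] = s["MBR"]
    elif op == "AC<-D.add(AC,MBR)":
        s["AC"] = s["AC"] + s["MBR"]  # the data operator D does the arithmetic
    elif op == "Mp[MAR]<-MBR":
        mp[s["MAR"]] = s["MBR"]
    else:
        raise ValueError(f"unknown micro-operation: {op}")


def run(mp):
    """Interpret the external ISP program held in mp until HALT."""
    s = {"PC": 0, "AC": 0, "MAR": 0, "MBR": 0, "IR": None}
    while True:
        for op in MICROCODE["FETCH"]:
            micro_step(op, s, mp)
        opcode = s["IR"][0]
        if opcode == "HALT":
            return s
        for op in MICROCODE[opcode]:
            micro_step(op, s, mp)


# Instruction words are (opcode, address) pairs; data words are integers.
memory = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", 0), 2, 3, 0]
state = run(memory)
print(memory[6], state["AC"])
```

Apart from the single addition, every micro-operation is a plain transfer between registers, which is the sense in which the microprogram processor is primitive compared with the processor it realizes.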

The  long  lag  between  the  idea  of  microprogramming  and  its 
more  widespread  adoption  is  due  to  several  reasons.  Early  ISP's 
were  comparatively  straightforward,  so  that  a  microprogram  ap- 
proach was  not  economically  justified.  The  interpretation  overhead 
time  is  higher  than  with  the  hardwired  approach,  and  unless 
complex  functions  are  realized  this  time  becomes  objectionable. 
In  addition,  suitable  read-only  memories  were  not  developed  until 
the mid 1960s (though it is unclear whether this is cause or effect).
An  additional  feature  of  using  a  P.microprogram  is  the  ability  to 


realize  several  ISP's  within  a  single  physical  processor.  IBM  has 
exploited this feature extensively in the System/360 (Part 6,
Sec. 3), which is by far the most ambitious use of
microprogramming. One can argue that without the additional payoff, which
was used to ease the transition to a new incompatible computer
system by providing emulation of the old system, the
microprogramming would be marginal.

Several P.microprogram design approaches have emerged:
Kampe (Chap. 29) presents a design based on a short word; the
internal processor is very much like a conventional processor. At
the other extreme, the IBM System/360 (Chap. 32) is based on
a long word which allows multiple operations to be coded in
parallel. (The parallel operations are necessary to gain an
acceptable performance level.) Thompson Ramo Wooldridge called their
AN/UYK a "stored logic" computer, and it provided the ability
to use primary memory for defining the ISP. The IBM System/360
Model 25 (page 567) also uses this approach. The Hewlett-Packard
desk calculator (Chap. 20) shows the use of microprogramming
on a relatively circumscribed, but complex, task.

Central processors (Pc). These processors interpret an instruction
set for manipulating arithmetic, logical, and symbolic data-types.
In all simple systems it is the only processor and thus does all
tasks. The growth of processor specialization can be described in
terms of relieving the Pc of simpler functions that require
substantial processing time but do not make full use of the devices
within the Pc, such as the arithmetic units. Crucial to this issue
is the time it takes the Pc to switch from one task to another (recall
the discussion on Mps, the processor state), since many of the jobs
that are extracted to specialized processors are demand jobs, such
as input/output.

With the removal of tasks from the Pc, it becomes more
specialized. A very pure example of this is the Pc of the CDC 6600
(Chap. 39), which has no input/output instructions of any kind.
That is, not only has the control and management of
communication and transmission with the T's and Ms's been
removed from the Pc, but the act of initiation has been removed
as well and placed in the Pio's. Thus, the 6600 Pc is just an
engine for working on the arithmetic, logical, and symbolic
(address) data-types.

The mixture of operations to be performed in most complex
algorithms prevents specialization of the Pc from going very far,
e.g., from there being a P.arithmetic, for with every switch
between capabilities distributed in distinct P's there must be
intercommunication of the components, which introduces an overhead
cost in processing time.




Input/output processors (Pio). The Pio specializes in the
management of peripherals (secondary memories and terminals). They are
also called peripheral processors, data channels, and channels.¹
The tasks a Pio and its subordinate peripherals perform are the
transmission of information between Ms and Mp; the transmission
of information between some extra computer real-time system
(e.g., human); and the transmission of information outside the C,
via a T to some other information media (e.g., a card reader, card
punch, line printer, etc.). All the above tasks are similar and often
are considered the same, though in principle they can be quite
different. A task in this environment is the management of some
quanta of information, whether it be one bit or character, a voice
message, or a record or file from magnetic disk or magnetic tape.
Thus a Pio does not usually change any information; it is merely
an interpreter for moving information. There are three exceptions:
Computation is required for error correction and/or detection;
computation is required if recoding and reformatting are done;
and computation is required when search operations are carried
out on Ms without Pc intervention.

To accomplish the above tasks requires a fairly simple
instruction set. Typically it contains jump (branch); data transmission
within Mp to initialize process variables; simple counting ability,
e.g.,  to  control  error  retries;  subroutine  calling;  interrupt  process 
handling;  initializing  KMs  or  KT;  testing  the  state  of  KMs  or  KT; 
and  sometimes  code  conversion  (data  in  one  code  format  is  con- 
verted to  another  code).  Thus  substantial  arithmetic  and  logic 
facility  is  not  needed.  Part  4,  Sec.  1  provides  a  detailed  discussion 
of  Pio's. 
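
A small interpreter shows how little is needed. The sketch below is hypothetical in every detail: the five-instruction channel program, the `FakeKMs` control that fails a fixed number of reads, and the retry count are assumptions made for illustration, not the instruction set of any Pio described in this book. It covers initializing a KMs, transferring a record into Mp, testing the result, counting error retries, and interrupting the Pc at the end.

```python
class FakeKMs:
    """Stand-in control for a secondary memory whose first reads fail."""

    def __init__(self, record, failures):
        self.record = list(record)
        self.failures = failures

    def reset(self):
        pass  # a real control would reposition the device here

    def read_into(self, mp, start, count):
        """Copy `count` words into mp at `start`; return False on error."""
        if self.failures > 0:
            self.failures -= 1
            return False
        mp[start:start + count] = self.record[:count]
        return True


def run_pio(program, mp, kms, retries=3):
    """Interpret a channel program; return the interrupt code sent to Pc."""
    pc, count, ok = 0, retries, False
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "init":
            kms.reset()
        elif op == "transfer":
            ok = kms.read_into(mp, args[0], args[1])
        elif op == "jump_if_ok":
            if ok:
                pc = args[0]
        elif op == "retry":        # simple counting: jump back while tries remain
            count -= 1
            if count > 0:
                pc = args[0]
        elif op == "interrupt":
            return args[0]
        else:
            raise ValueError(f"unknown channel instruction: {op}")
    return "end of program"


PROGRAM = [
    ("init",),               # 0: initialize the KMs
    ("transfer", 10, 4),     # 1: move 4 words from Ms to Mp[10..13]
    ("jump_if_ok", 5),       # 2: test the state of the KMs
    ("retry", 0),            # 3: count down and try again
    ("interrupt", "error"),  # 4: retries exhausted
    ("interrupt", "done"),   # 5: record is in Mp
]

mp = [0] * 16
print(run_pio(PROGRAM, mp, FakeKMs([7, 8, 9, 10], failures=2)), mp[10:14])
```

No arithmetic beyond counting appears in the interpreter, which matches the observation that a substantial arithmetic and logic facility is not needed.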

Display processors (P.display). The P.display is a complex Pio that
processes information for display terminals. The data-type is a
representation of a complex graphic object, e.g., lines, points,
curves, and spatially localized text. The representations vary
considerably from system to system, using various list pointers and
vector encodings. The operations on the data-types include the
maintenance of the display (due to the short-term persistence of
the CRT); the selective modification of the representation under
commands from the T.display or the Pc, such as adding or deleting
a line, inserting text, etc.; the control of T.inputs such as
keyboards, light pens, joysticks; and the performance of more complex
spatial transformations, such as translation, rotation, scale change,
and determination of hidden lines.

¹These terms are usually used without distinguishing between a Pio and
a Kio, that is, whether the device interprets a sequential program (and
thus is capable of sustained independent activity) or only decodes a single
instruction.


The  P.display  is  a  good  example  of  a  highly  complex  but  spe- 
cialized data-type  for  which  there  are  substantial  local  operations 
to  perform,  that  is,  where  no  interaction  is  needed  with  a  complex 
algorithm  (that  requires  the  Pc).  Users  of  displays  wish  to  correct, 
modify,  and  transform  the  display  in  geometrically  simple  ways 
(in  effect,  edit  and  view)  between  processing  of  the  graphic  infor- 
mation by  complex  algorithms.  Thus  the  graphic  display  is  a  prime 
candidate  for  the  development  of  a  specialized  processor. 

The  DEC  338  (Chap.  25)  is  typical  of  these  processors,  being 
neither  the  simplest  nor  the  most  complex  (e.g.,  it  does  not  have 
rotation  or  hidden  line  elimination  instructions). 

Array  processors  (P.array).  The  array  processor  might  be  considered 
a  more  general  Pc.  It  has  been  proposed  or  discussed  in  the  litera- 
ture for  some  time.  (See  bibliography  for  Chap.  27,  page  329.)  The 
information unit processed is an array of one (vector) or two
(matrix)  dimensions.  Instructions  are  provided  to  operate  on  these 
data.  The  specification  of  algorithms  for  a  P.array  is  based  on  the 
assumption  that  an  operation  can  be  carried  out  in  parallel  for 
array  elements.  Actually,  both  serial  (sequential)  and  parallel 
(concurrent) execution can be implemented. Both structures have
the  same  logical  characteristics,  from  an  ISP  viewpoint,  and  may 
differ  only  in  execution  rate.  The  three  array  processors,  ILLIAC 
IV  (Chap.  27),  NOVA  (Chap.  26),  and  the  IBM  2938  (page  577), 
are  discussed  in  Part  4,  Sec.  2  (page  315). 
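
The claim that serial and parallel execution share the same logical characteristics can be illustrated with a minimal Python sketch. The function names and the use of a thread pool to stand in for multiple arithmetic units are assumptions made for illustration; they do not model any of the machines named above.

```python
from concurrent.futures import ThreadPoolExecutor

def vector_add_serial(a, b):
    """One arithmetic unit visits the elements one after another."""
    return [x + y for x, y in zip(a, b)]

def vector_add_parallel(a, b, units=4):
    """Several units work on elements concurrently; results keep element order."""
    with ThreadPoolExecutor(max_workers=units) as pool:
        return list(pool.map(lambda pair: pair[0] + pair[1], zip(a, b)))

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
# Same instruction, same result; only the execution rate could differ.
assert vector_add_serial(a, b) == vector_add_parallel(a, b)
```

A program written against either function sees the same result, which is the sense in which the two structures differ only in execution rate.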

Vector-move  processors.  The  vector-move  processor  is  a  special-case 
P.array.  It  is  capable  only  of  moving  a  word  vector  at  some  loca- 
tion in  Mp  to  some  other  location  within  Mp.  Because  of  its  limited 
instruction set, such a P is found only in computers which require
constant Mp shuffling. This condition arises either because of a
hierarchy of Mp speeds or because the programs must have a
particular structure before they can be interpreted by the processor.
A time-shared computer might require such a processor for
multiprogram memory management. It is therefore common to find
block (vector) transmission instructions in a Pc. The IBM
System/360 has a Pio(Storage channel) for this function (page 577).

Special algorithm processors (P.algorithm). Only a small number
of  special  algorithm  processors  have  been  specified  and/or  imple- 
mented. High  performance  is  almost  guaranteed  by  hardwiring  and 
through  specialization.  The  time  to  fetch  the  algorithm  (instruc- 
tion fetch  time)  and  many  of  the  references  to  Mp  for  temporary 
data  are  eliminated  by  hardwiring.  A  hardwired  algorithm  can 
easily  outperform  a  stored  program  by  a  factor  of  10  ~  100.  The 
lack  of  these  processors  in  systems  stems  mainly  from  lack  of 
market  demand. 


Chapter  3     The  computer  space  73 


It is not clear that the special algorithm processors meet our
criteria for being a processor, because of the rather limited functions
they perform. In fact, some so-called processors are just K's
or D's, since they have no instruction location counter and interpret
only a single instruction at a time, requesting each new
instruction from a superior component.

Algorithms which have been hardwired (or proposed) include
the fast Fourier transform using the Cooley-Tukey algorithm;
cross-correlation, autocorrelation, and convolution processing;
polynomial and power-series evaluation; floating-point array
processing; and neural network simulation.¹

Language processors (P.language). Language P's interpret a language
that has been designed to some external criteria, such as
a procedure-oriented language (ALGOL or FORTRAN) or a list
language (IPL-VI). Thus complexity takes the form of a complex
data-type for the "instruction," rather than a complex data-type
for processing (e.g., floating complex numbers). If such processors
were extended to do all the things a Pc also does, then they would
become more complex than a Pc. However, to date, most of them
are experimental and focus exclusively on language interpretation.

In Part 4, Sec. 4, several examples are presented. It is worthy
of note that of the three P.languages only EULER (Chap. 32) has
been implemented in hardware using a P.microprogram.

Memory  access 

The most useful classification of memories is according to their
accessing algorithm.² These are queue (i.e., access according to
first-in-first-out discipline); stack (i.e., access according to
first-in-last-out discipline); linear (e.g., a tape with forward read and
rewind); bilinear (e.g., a tape with forward and backward read);
cyclic (e.g., a drum); random (e.g., core); and content and associative.
All these memories are explicitly addressed except the stack
and queue, which deliver an implicitly specified i-unit on each
read.

Memory  size  and  basic  operation  times  (i.e.,  the  time  constants 
in  the  access  algorithm)  are  important  too,  of  course.  But  once 
a distinction is made between Mp and Ms, then for any given
technological  era  there  have  existed  characteristic  sizes  and  speeds 

¹Chasm: A Macromodular Computer for Analog Neuron Models [Molnar,
1967].

²Access for writing should be distinguished from access for reading. Memories
are conceivable with arbitrarily different read and write access algorithms
(e.g., random read and cyclic write). However, in general, the two
access algorithms are tightly coupled, and normally only the read access
algorithm is given.


for memories of a specified access algorithm. Where there has
been variation, either it has been linear with size (e.g., buying
two boxes of magnetic core Mp versus buying one) or there has
been a narrow range of cost/performance tradeoff (as in data rate
for  magnetic  tapes,  in  which  modest  increases  in  density  and  tape 
speed  can  be  bought  for  substantially  increased  dollars).  Table  5 
shows  the  relative  price,  size,  and  performance  of  various  mem- 
ories. The  memory-size  versus  information-rate  plot  (Fig.  14)  shows 
the  clustering  of  memories  and  their  suitability  for  a  particular 
function. 

From a technology standpoint, Mp's have been constrained to
either cyclic- or random-access memories (although one can easily
construct any type from random-access memories). In Part 2, Sec. 1
we have not separated the machines according to whether they
used cyclic- or random-access memories. The early first-generation
computers used cyclic-access memories. Part 3, Sec. 2 presents
only the cyclic-access memories.

Similarly, Ms's have been constrained to be cyclic or linear,
although quasi-random access has been achieved with some disks
and magnetic-card memories (random by block and linear or cyclic
within a block). Any Ms's can be part of almost any computer
structure. Thus there is no large effect of Ms structure on the main
design features of computer systems, and they are not discussed
to any extent in the remainder of the book. Our discussion of
memory type below deals exclusively with Mp and Mps.

Stack and queue memories (M.stack, M.queue). Data elements in
a stack and queue are not accessed explicitly, as we noted above.
The stack has some rather unique properties that aid in the compilation
and evaluation of nested arithmetic expressions. Although
there are no machines employing stacks exclusively for primary
memory, there are stacks in some arithmetic processors. Part 3,
Sec. 5 is devoted to processors with stack memories (i.e., with
stacks in the processor state).

The IPL-VI machine (Chap. 30) is the only computer in the
book to have its entire memory organized as a list of stacks.
Although no hardware exists that inherently behaves as a stack
or queue,¹ it can be simulated by a random-access memory. A shift
register capable of shifting in either of two directions is a stack.
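
How a random-access memory can simulate the stack and queue disciplines is sketched below in Python. This is a minimal illustration under assumptions: the class names, the fixed-size list standing in for M.random, and the pointer names are all hypothetical and are not taken from any machine in the book.

```python
class RamStack:
    """First-in-last-out access over a random-access array and one pointer."""
    def __init__(self, size):
        self.cells = [None] * size   # stands in for M.random
        self.top = 0                 # index of the next free cell

    def push(self, item):
        if self.top == len(self.cells):
            raise OverflowError("stack full")
        self.cells[self.top] = item
        self.top += 1

    def pop(self):
        if self.top == 0:
            raise IndexError("stack empty")
        self.top -= 1
        return self.cells[self.top]


class RamQueue:
    """First-in-first-out access over the same kind of array, used circularly."""
    def __init__(self, size):
        self.cells = [None] * size
        self.head = 0    # index of the oldest item
        self.count = 0   # number of items currently held

    def put(self, item):
        if self.count == len(self.cells):
            raise OverflowError("queue full")
        self.cells[(self.head + self.count) % len(self.cells)] = item
        self.count += 1

    def get(self):
        if self.count == 0:
            raise IndexError("queue empty")
        item = self.cells[self.head]
        self.head = (self.head + 1) % len(self.cells)
        self.count -= 1
        return item
```

In both cases the caller never supplies an address; the pointers held beside the array choose the cell, which is what makes the i-unit implicitly specified.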

Cyclic-access memories (Mp.cyclic). Nearly all the first-generation
(vacuum tube) computers had Mp.cyclic. The Mp.cyclic acoustic,
magnetostrictive  delay  line,  and  magnetic  drum  provided  an  in- 

¹Small (10 ~ 1,000 word) queue- and stack-accessed memories are especially
easy to build with large-scale integrated-circuit technology.


Part  1     The  structure  of  computers 


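Table 5  Memory characteristics

Punched paper card
   Function: permanent, archival
   Access method: random + linear
   Module size (bits): ([illegible] ~ 1,000)/card; ~1,000-card unit
   Modules/computer: 1 ~ 2
   Access time (sec) or data rate (bits/sec), one entry only: 10^[illegible] ~ 10^[illegible]
   Cost/bit ($)¹: 2 x 10^-6 + 2 x 10^-1

Magnetic card
   Function: secondary, archival
   Access method: linear + constant + cyclic
   Module size (bits): 3 x 10^9
   Modules/computer: 1 ~ 4
   Access time (sec): 10^-1 ~ 10^[illegible]
   Data rate (bits/sec): 0.4 x 10^6
   Cost/bit ($)¹: 1.5 x 10^-8 + 5 x 10^-5

Magnetic tape
   Function: secondary, archival
   Access method: linear
   Module size (bits): 2 x 10^8
   Modules/computer: 1 ~ 16
   Access time (sec): 10^[illegible] ~ 10^[illegible]
   Data rate (bits/sec): (0.4 ~ 4) x 10^6
   Cost/bit ($)¹: 2 x 10^[illegible] + 10^-1

Moving-head disk pack
   Function: secondary, files swapping
   Access method: linear + cyclic
   Module size (bits): 2 x 10^[illegible]
   Modules/computer: 1 ~ [illegible]
   Access time (sec): 10^-1 ~ 10^0
   Data rate (bits/sec): 2.5 x 10^6
   Cost/bit ($)¹: 3 x 10^-6 + 10^[illegible]

Fixed-head disk
   Function: secondary, files swapping
   Access method: cyclic
   Module size (bits): 5 x 10^[illegible]
   Modules/computer: 1 ~ 40
   Access time (sec): ~10^-2
   Data rate (bits/sec): 10^6 ~ 10^7
   Cost/bit ($)¹: 10^-3

Drum
   Function: secondary, swapping
   Access method: cyclic
   Module size (bits): (1 ~ 5) x 10^[illegible]
   Modules/computer: 1 ~ 10
   Access time (sec): (5 ~ 30) x 10^-3
   Data rate (bits/sec): 10^6 ~ 10^7
   Cost/bit ($)¹: 10^-3

Bulk core memory
   Function: primary and/or secondary, swapping
   Access method: random
   Module size (bits): 10^[illegible]
   Modules/computer: 1 ~ [illegible]
   Access time (sec): (2 ~ 10) x 10^[illegible]
   Data rate (bits/sec): 10^[illegible] ~ 10^[illegible]
   Cost/bit ($)¹: [illegible]

High-speed core or thin-film memory
   Function: primary
   Access method: random
   Module size (bits): 10^5 ~ 10^6
   Modules/computer: 1 ~ 15
   Access time (sec): (0.2 ~ 2) x 10^[illegible]
   Data rate (bits/sec): 10^[illegible] ~ 10^[illegible]
   Cost/bit ($)¹: 0.05 ~ 0.25

Integrated circuit (scratch-pad memory)
   Function: primary, processor state
   Access method: random
   Module size (bits): 10^3 ~ 10^5
   Modules/computer: 1
   Access time (sec): ~10^[illegible]
   Data rate (bits/sec): 10^9
   Cost/bit ($)¹: 0.25 ~ 1.0

Integrated circuit (content addressable)
   Function: primary, cache
   Access method: content, random
   Module size (bits): 2 x 10^5
   Modules/computer: 1 ~ 2
   Access time (sec): ~10^[illegible]
   Data rate (bits/sec): 10^[illegible]
   Cost/bit ($)¹: 1 ~ 3

Read only (capacitor, inductor)
   Function: processor instruction-set definition
   Access method: random
   Module size (bits): (1 ~ 5) x 10^5
   Modules/computer: 1
   Access time (sec): 10^-6 ~ 10^-7
   Data rate (bits/sec): 10^8 ~ 10^9
   Cost/bit ($)¹: 10^-3 ~ 10^-2

¹The first component is the memory media (e.g., a disk pack), and the second component is the transducer (e.g., a disk drive).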


expensive,  simple,  producible  memory.  By  the  second  generation 
the cost of Mp.random (though still more expensive than an
Mp.cyclic) was about equal to the processor logic. The incremental
cost for an Mp.random in a large system was then small, whereas
the  performance  gain  could  be  a  factor  of  up  to  3,000  (access  time 
of  10  microseconds  versus  30  ~  30,000  microseconds).  Some  of  the 
first-generation  machines  were  reimplemented  using  transistors 
(the  LGP-30  became  the  LGP-21).  Only  a  few  new  cyclic 
access  machines  were  introduced  in  the  second  generation.  Most 
notable  was  the  low-cost  Packard-Bell  PB-250  using  transistor  logic 
and magnetostrictive delay lines (a derivative of the Bendix G-15
and  NPL  ACE). 

Nearly all these computers use some form of n + 1 addressing.
The memory is organized on a digit-by-digit serial basis for a word
(e.g., ZEBRA with binary and IBM 650 with decimal). Hence, the
arithmetic or logic function hardware is implemented for only a
single digit. An operation is done for the entire word by iterating
over all digits in time; thus the cost of a serial computer is nearly
independent of its word length.
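
The idea of doing a whole-word operation by iterating one digit at a time can be sketched for the binary case as follows. This is an illustrative Python model, not a description of ZEBRA or any other machine: the function name and the representation of a word as a non-negative integer are assumptions.

```python
def serial_add(a, b, word_length):
    """Add two non-negative integers one bit per step, as a serial machine would.

    Only a one-bit full adder and a carry flip-flop are modeled; the word
    length changes the number of steps, not the amount of logic.
    """
    carry = 0
    result = 0
    for i in range(word_length):          # one digit time per iteration
        bit_a = (a >> i) & 1
        bit_b = (b >> i) & 1
        total = bit_a + bit_b + carry     # the single full adder
        result |= (total & 1) << i
        carry = total >> 1
    return result, carry                  # carry is the overflow out of the word

print(serial_add(5, 6, 8))     # (11, 0)
print(serial_add(255, 1, 8))   # (0, 1): the sum overflows an 8-bit word
```

Doubling `word_length` doubles the loop count but leaves the body of the loop, the stand-in for the adder hardware, unchanged.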

Because  of  the  cyclic  and  synchronous  nature  of  these  Mp's, 
it  is  difficult  to  synchronize  them  with  secondary  memories  and 
terminals  (which  are  also  synchronous).  The  very  early  machines 
had  no  large  secondary  memories.  In  some  cases,  where  magnetic 
tape  was  used,  it  was  added  at  very  low  performance  (low  density, 
low  speed,  and,  therefore,  low  data  rates)  so  that  synchronization 
was  not  a  problem.  In  other  cases  a  small  random-access  core 


memory was added to provide synchronization between the two
memories (for example, IBM 650).

Random-access memories (Mp.random). Random-access memories
were used late in the first generation, and they have remained
the predominant memory during the second and third generations.
It is unlikely that their popularity will decline unless
content-addressable memories can be constructed sufficiently cheaply (if
then). The earliest first-generation random-access memories were


electrostatic and depended on maintaining a charge on plates of
an array of capacitors. The most common was the Williams tube
(invented by F. C. Williams at the University of Manchester),
which works in essence like a CRT, with the beam used to charge
a capacitor array at the tube face [Williams and Kilburn, 1949].
Other schemes included an array of capacitors which were selected
by digital logic (Pilot, Chap. 35).

Late in the first generation Forrester [1951] invented the core
memory, which rapidly became the predominant primary-memory


component.  It  is  unlikely  that  it  will  be  replaced  in  the  near 
future;  the  most  likely  candidate  is  large-scale  integrated-circuit 
arrays  of  flip-flops. 

The  random-access  memory  seems  nearly  perfect  for  the  Mp's 
of  present  computers.  Of  course,  enthusiasm  for  this  memory  may 
be  based  on  not  knowing  how  computers  would  have  developed 
if  we  had  not  had  them.  However,  with  little  or  no  effort  an 
M.random can be a stack, a queue, a linear, a cyclic, and even
(within  limits)  a  content  or  associative  memory.  It  is  an  organiza- 
tion which  is  very  hard  to  beat. 

Content-addressable  and  associative  memories.  It  is  possible  to 
conceive of many exotic accessing capabilities, and numerous
proposals  have  been  made  involving  either  theoretical  structures 
or  experimental  prototypes.  Since  no  particular  varieties  have 
become  widespread,  terminology  is  still  variable.  Content- 
addressable  memories  are  usually  taken  to  mean  a  collection  of 
cells  of  predetermined  size  (i.e.,  a  fixed  i-unit)  such  that  if  one 
presents  as  "address"  the  contents  of  a  predetermined  part  of  the 
cell (the tag or content address) then the contents of the entire
cell  will  be  retrieved.  An  associative  memory  is  usually  taken  to 
mean  a  system  such  that,  when  presented  with  an  item  of  informa- 
tion, it  delivers  one  or  more  "associated"  items  of  information. 
The  principle  of  association  is  variable,  yielding  different  kinds 
of  associative  memories.  Content-addressable  memories  provide 
a  form  of  association,  as  do  all  memories,  in  fact.  Thus  the  term 
"associative  memory"  tends  to  denote  forms  of  association  different 
from  familiar  ones — forms  that  presumably  have  less  sharp  con- 
straints imposed  by  the  structure  of  memory  (as  opposed  to  the 
structure  of  the  information  in  the  memory). 

No examples exist of a computer with a content-addressable
memory as its primary-memory structure. However, both the IBM
360 Model 67 (page 571) and Model 85 (page 574) use 8-word and
~1,000-word content-addressable memories, respectively, to increase
performance (in both cases they are transparent to the
program).  The  CDC  6600  instruction  buffer  is  in  effect  a  small 
content-addressable  memory.  In  the  above  three  cases,  the  con- 
tent-addressable memories  vary  in  size  and  position  in  the  struc- 
ture; however,  the  pattern  of  use  is  common.  There  is  a  large  but 
slower  Mp.random  behind  the  content-addressable  memory.  The 
purpose  of  the  fast  small  content-addressable  memory  is  to  hold 
local,  current  data  so  that  an  access  will  not  have  to  be  made  to 
the  random-access  memory. 
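
The common pattern of use, a small fast content-addressable memory in front of a larger, slower Mp.random, can be sketched as follows. This is a read-only Python model under stated assumptions: the class name, the oldest-first replacement policy, and the sequential tag comparison are illustrative choices and do not describe the Model 67, the Model 85, or the CDC 6600.

```python
class SmallContentAddressedBuffer:
    """A few cells, each holding (tag, contents); lookup is by tag, not position."""
    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing      # list standing in for the slower Mp.random
        self.cells = []             # (address tag, contents) pairs, oldest first
        self.slow_accesses = 0      # counts references that reached Mp.random

    def read(self, address):
        for tag, contents in self.cells:   # hardware would compare all tags at once
            if tag == address:
                return contents
        self.slow_accesses += 1            # miss: go to the slower memory
        contents = self.backing[address]
        if len(self.cells) == self.capacity:
            self.cells.pop(0)              # assumed policy: discard the oldest cell
        self.cells.append((address, contents))
        return contents
```

A program calling `read` sees the same values it would get from the backing memory directly, which is the sense in which such a memory is transparent to the program; only `slow_accesses` reveals how many references actually reached the slower memory.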

Small prototype associative addressable M's have been constructed,
but they are normally based on random-access memories
under  the  control  of  special  hardware.  There  are  immediate  uses 


for  content-addressable  memories  with  a  large  information-content 
address.  For  example,  the  read-only  memories  for  microprogram 
processors  use  long  words  principally  because  content-addressable 
memories  are  not  available.  Ideally  a  microprogrammed  processor 
would  like  to  look  at  a  fairly  large  processor  state  to  determine 
what  action  is  to  be  taken  in  the  microprogram.  It  is  interesting 
to  speculate  about  the  evolution  of  computers  if  a  content- 
addressable  memory  had  been  developed  in  place  of  the  random- 
access  memory. 

Mp  concurrency 

Multiprogramming  is  the  simultaneous  existence  of  multiple, 
independent  programs  within  Mp  being  processed  sequentially  or 
in  parallel  by  one  or  more  processors.  Multiprogramming  provides 
each  user  program  with  a  memory  space  independent  of  other 
users.  It  may  provide,  in  addition,  the  sharing  by  several  users  (for 
independent  use,  not  for  communication)  of  a  block  of  Mp,  which 
thus  does  not  have  to  be  duplicated.  For  example,  operating  sys- 
tems software,  including  compilers,  assemblers,  loaders,  and  edi- 
tors, can  be  usefully  shared. 

The  ability  to  have  multiple  programs  gives  rise  to  a  corre- 
sponding problem  of  communication  between  programs.  We  have 
defined  this  as  a  correlated  dimension  in  the  computer  space 
(interprogram  communication)  and  will  discuss  it  in  the  next  sec- 
tion. The  issues  it  raises  are  just  the  opposite  from  those  raised 
by  the  requirement  for  multiple  programs,  which  are  discussed 
in  this  section.  Here  we  are  concerned  with  protecting  one  pro- 
gram from  another — with  assuring  that  no  unjustified  communica- 
tion will  occur — and  with  obtaining  appropriate  space  in  Mp  so 
that  multiple  programs  can  run. 

The  requirement  for  protection  is  obvious.  If  two  independent 
programs  are  to  be  resident  in  Mp  at  the  same  time,  they  must 
not  have  access  to  each  other's  space.  Not  only  would  such  access 
(especially  for  writing)  have  disastrous  consequences  when  the 
programs are running, but they would be entirely unpredictable
and  undebuggable  from  the  viewpoint  of  the  programmer  of  each 
individual  program.  Thus  this  requirement  is  absolute;  i.e.,  it  must 
be  highly  reliable.  This  implies  a  hardware  solution,  although 
purely  software  schemes  are  possible  in  special  cases. 

The  requirement  for  appropriate  space  is  somewhat  more  sub- 
tle. Certainly  there  must  be  enough  space  in  Mp  for  all  the  pro- 
grams that  are  to  be  resident  simultaneously.  It  must  be  possible 
to  find  that  space,  assign  it  to  a  new  program,  and  make  it  available 
again  when  that  program  is  finished.  But  what  kind  of  space  will 
do?  Must  it  be  a  single  interval  of  Mp,  large  enough  for  the  total 
program  with  data?  And  if  the  program  is  assembled  or  compiled 


in  Mp  and  is  removed  temporarily  to  make  room  for  another 
program,  must  it  be  brought  back  into  the  exact  same  addresses 
into  which  it  was  originally  assembled? 

The  key  issue  resides  in  the  kind  of  intercommunications  that 
hold  within  a  program  and  its  data,  for  these  determine  how  and 
in  what  way  a  program  is  interconnected  and  depends  on  the 
specific  Mp  addresses  that  it  occupies.  These  connections  are  of 
two  kinds:  explicit  addresses  present  in  the  program  and  data  and 
implicit relations between addresses due to addressing algorithms
(e.g.,  that  programs  are  laid  sequentially  in  Mp,  or  that  the  ele- 
ments of  an  array  are  to  be  accessed  by  indexing  and  hence  must 
occupy consecutive addresses). Again, although some purely software
solutions to the space issue exist, hardware is involved in
a fundamental way.

Thus, the two main questions of program concurrency¹
(protection and space assignment) imply basic design features of
a  computer  system.  It  might  seem  that  they  imply  separate  fea- 
tures and  should  be  separate  dimensions  in  the  computer  space. 
In  fact,  each  proposal  for  how  to  solve  the  space-assignment  prob- 
lem also  contains  a  particular  proposal  for  the  protection  problem. 
Thus  we  treat  them  as  a  single  dimension. 

Virtual-address space and mapping. Before considering various
solutions to Mp concurrency (i.e., the values along the dimension),
let us introduce two concepts in terms of which all current solutions
can be understood. Consider a particular program, PROGRAM-1,
one of many that might wish to reside in the Mp. PROGRAM-1
assumes a set of addresses, some explicitly and some
implicitly, in the addressing algorithm it uses. PROGRAM-1 requires
a memory space that has addresses that satisfy all these
requirements, the implicit and explicit ones. Other than that it
does not care how these addresses are realized. Let us call this
address space required by PROGRAM-1 its virtual memory, Mv.
Thus, each program has its own virtual memory. (You might think
of this as having its own Mp, except, as we shall see, this Mp may
be many times bigger than any actual Mp and still be entirely
feasible.)

Actually to run PROGRAM-1 requires that it be placed in the
real Mp in such a way that the real addresses of Mp containing
it satisfy all the requirements, that is, that it be a faithful image
of the virtual memory. Thus there must be some memory mapping
that maps the virtual addresses into the actual memory. Once
PROGRAM-1 is placed in Mp there must be some process that
takes each virtual address (as it occurs to be processed in an

¹See also Randell and Kuehner [1968].


instruction)  and  finds  the  actual  address  in  Mp,  so  that  the  correct 
contents  can  be  obtained. 

This might seem simply a complicated and abstract way to view
matters, but it becomes essential as soon as we realize that the
computer can have hardware memory mappings other than the
familiar  direct-addressing  structure  of  Mp.  Furthermore,  if  this 
mapping  is  given  the  right  properties,  it  may  solve  some  of  the 
space-assignment  and  protection  problems  for  Mp  concurrency. 
What  we  have  really  done  is  to  divorce  the  addressing  required 
by  the  programs  from  that  provided  by  the  physical  computer, 
so  that  we  can  redesign  it  (via  the  memory  mapping)  to  meet  new 
design  requirements  that  were  not  apparent  when  the  original 
random-addressing  schemes  were  created. 

Let  us  make  the  notion  of  memory  mapping  more  precise.  The 
program contains virtual addresses, z (that is, symbols in the program
that denote addresses are taken to denote addresses in Mv).
During  the  execution  of  the  program,  whenever  there  is  a  refer- 
ence to  an  address  z  (either  explicitly  via  an  address  calculation 
or  implicitly  via,  say,  getting  the  next  instruction),  a  computation 
occurs  on  z  to  obtain  the  actual  address  in  Mp.  This  computation 
is  part  of  the  Pc,  just  as  is  an  automatic  indexing  or  indirect- 
addressing  calculation.  It  takes  as  input  not  just  the  virtual  address 
z  but  information  on  where  the  program  is  located  in  Mp.  The 
latter  information  is  called  the  map,  and  a  program's  map  infor- 
mation is  determined  when  it  is  placed  into  Mp  on  a  given  run. 
Thus,  using  our  ISP  notation,  and  calling  the  address  calculation 
f,  we  get 

Mv[z]  :=  Mp[f(z,map)] 

That  is,  the  information  in  virtual  memory  at  virtual  address  z 
is  the  same  as  the  information  in  actual  memory  at  address 
f(z,map). 

This whole scheme is built to permit programs to be placed
in Mp's in various ways, e.g., relocated or scattered around, and
still make it possible to run the program. Any such scheme brings
a solution to the protection problem, namely, that for some values
of z the above calculation cannot take place or is invalid (i.e., there
is no mapping for z). This can correspond to a violation of protection,
which can then be prevented. All calculations may even be
permissible, but f is so arranged that it never produces an address
in anyone else's part of Mp.
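
The relation Mv[z] := Mp[f(z,map)] and the way an invalid calculation yields protection can be sketched in Python. This is one simple choice of f under stated assumptions: the map is taken to be an origin and a length in Mp, and the names `ProtectionViolation` and `read_virtual` are hypothetical. The text does not commit to this particular form of map.

```python
class ProtectionViolation(Exception):
    """Raised when no mapping exists for a virtual address."""

def f(z, map_info):
    """Map virtual address z to an actual Mp address, or refuse."""
    base, length = map_info            # assumed map contents: origin and size in Mp
    if not 0 <= z < length:
        raise ProtectionViolation(z)   # there is no mapping for z
    return base + z

def read_virtual(mp, z, map_info):
    """Mv[z] := Mp[f(z, map)]"""
    return mp[f(z, map_info)]

mp = list(range(100, 116))             # a 16-word Mp with recognizable contents
program_map = (8, 4)                   # this program occupies Mp[8:12]
print(read_virtual(mp, 2, program_map))   # 110, i.e. the contents of Mp[10]
```

Moving the program only requires changing `program_map`; the virtual addresses inside the program stay the same, and any z outside the program's own length is refused before Mp is touched.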

The memory map is part of each user's program. With many
users,  it  must  reside  in  Mp,  since  there  will  not  be  enough  space  in 
Mps  to  hold  a  large  amount  of  mapping  information.  However, 
when  a  program  is  being  executed,  some  part  of  the  mapping 
information  becomes  part  of  the  Mps  (i.e.,  at  least  the  Mp  address 


of  the  rest  of  the  map).  In  addition,  the  map  may  contain  special 
access  control  information,  such  as  whether  a  part  may  be  read, 
read  as  data,  written,  or  read  as  program.  The  map  can  also  collect 
statistical  information  concerning  whether  a  part  of  the  program 
has  been  used  or  has  been  changed  (written). 
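
Access control and usage statistics kept in the map can be sketched as one map entry per part of a program. The field names, the three access kinds, and the function `check_access` are assumptions made for this Python illustration; the actual layout of such information varies from machine to machine.

```python
from dataclasses import dataclass

@dataclass
class MapEntry:
    base: int                 # where this part of the program sits in Mp
    length: int
    may_read: bool = True     # may be read as data
    may_write: bool = False   # may be written
    may_execute: bool = False # may be read as program
    used: bool = False        # statistic: set on any access
    changed: bool = False     # statistic: set on any write

def check_access(entry, z, kind):
    """Return the Mp address for offset z, or raise PermissionError.

    kind is one of "read", "write", "execute" (names assumed for this sketch).
    """
    allowed = {"read": entry.may_read,
               "write": entry.may_write,
               "execute": entry.may_execute}[kind]
    if not allowed or not 0 <= z < entry.length:
        raise PermissionError(f"{kind} of {z} refused")
    entry.used = True
    if kind == "write":
        entry.changed = True
    return entry.base + z
```

The `used` and `changed` flags are the kind of statistic a supervisor could consult later, for example to decide whether a part that is removed from Mp needs to be copied back to secondary memory.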

Random-access  memories  for  Mp  constrain  the  mapping  by 
requiring  linear  addresses  of  the  form  Mp[0:p],  since  the  mapping 
calculation  must  be  economical  (as  it  is  performed  with  very  high 
frequency).  We  would  not  consider  a  map  structure  which  provides 
every  word  in  Mv  to  be  mapped  into  an  arbitrary  word  in  Mp, 
for  this  would  require  a  map  exactly  the  same  size  as  Mv.  With 
many programs in Mp, there would be little room for anything
but maps. Similarly, the amount of processing in f, the calculation,
must be minimal. These two aspects constrain the mapping
scheme  strongly. 
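
The size argument can be made concrete with a small calculation. The Python sketch below assumes that Mv is mapped in equal blocks of consecutive words (the block size and the 65,536-word virtual memory are illustrative figures, not values from the text) and counts the map entries needed.

```python
def map_entries(virtual_words, words_per_block):
    """Number of map entries needed when Mv is mapped in blocks of the given size."""
    return -(-virtual_words // words_per_block)   # ceiling division

mv_words = 2 ** 16          # assumed size of one program's virtual memory
print(map_entries(mv_words, 1))      # 65536: one entry per word, a map as large as Mv
print(map_entries(mv_words, 1024))   # 64: one entry per 1,024-word block
```

Mapping word by word gives a map with as many entries as Mv has words, whereas mapping in blocks shrinks the map by the block size, at the price of requiring addresses within a block to stay consecutive in Mp.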

The  constraint  to  linear  addresses  appears  to  force  the  structure 
of  virtual  memory  to  consist  of  a  multidimensional  array.  This  can 


be one-dimensional, Mv[0:n], or two-dimensional, Mv[0:s][0:m]. It
could be of higher dimension, but the need seems not to have been
felt (since within any single dimension one can have multidimensional
arrays as one normally does in a regular Mp). However,
the two-dimensional array, which also is called segmented
addresses, since it can be taken as a discrete collection of s + 1
segments each of m + 1 linear addresses, has advantages in terms
of the mappings; namely, segments can be placed disjointly in Mp
without fear that virtual-address calculations will cross from one
segment to another.
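
A two-dimensional (segmented) virtual address Mv[0:s][0:m] can be mapped as sketched below. This Python fragment is a minimal illustration: the segment table of (base, length) pairs and the function name are assumptions, not the mechanism of any particular machine.

```python
def map_segmented(segment, offset, segment_table):
    """Map the two-dimensional virtual address Mv[segment][offset] into Mp.

    segment_table[segment] is an assumed (base, length) pair for that segment.
    """
    if not 0 <= segment < len(segment_table):
        raise IndexError("no such segment")
    base, length = segment_table[segment]
    if not 0 <= offset < length:
        raise IndexError("offset outside segment")   # cannot spill into a neighbor
    return base + offset

# Two segments placed disjointly, and in reverse order, in Mp.
table = [(400, 16), (32, 8)]
print(map_segmented(0, 15, table))   # 415
print(map_segmented(1, 0, table))    # 32
```

Because the offset is checked against the length of its own segment, an address calculation that runs past the end of segment 0 is refused rather than landing in whatever happens to follow it in Mp.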

With  this  introduction  to  the  problems  of  multiprogramming 
we  will  look  at  some  of  the  hardware  schemes.  Table  6  provides 
a  summarization  of  them,  including  a  brief  description  of  how  each 
scheme  operates. 

No special mapping hardware. If no hardware exists in the Pc to
accomplish  a  memory  address  mapping,  then  when  the  address 


Table 6  Memory-allocation methods
(arranged in order of increasing hardware complexity)

For each scheme, "Method" gives the method of memory allocation (multiple users) and "Limits" gives the limits of the particular method (example of use).

No relocation, Mv < Mp:

Conventional computer, no memory-allocation hardware
   Method: No special hardware. Completely done by interpretive programming.
   Limits: Completely interpretive programming required. Very high cost in time is paid for generality. (JOHNNIAC interpreting JOSS)

1 + 1 users. Protection bit for each memory cell
   Method: A protection bit is added to each memory cell. The bit specifies whether the cell can be written or accessed.
   Limits: Only 1 special user + 1 other user is allowed. User programs must be written at special locations or with special conventions, or loaded or assembled into place. The time to change bits if a user job is changed makes the method nearly useless. No memory allocation by hardware. (IBM 1800)

1 + 1 users. Protection bit for each memory page
   Method: A protection bit is added for each page. (See above scheme.)
   Limits: No memory allocation by hardware. (SDS Sigma 2)

Page-locked memory
   Method: Each block of memory has a user number which must coincide with the currently active user number.
   Limits: Not general. Expensive. Memory relocation must be done by conventions or by relocation software. A fixed, small number of users are permitted by the hardware. No memory allocation by hardware. A program cannot be moved until it is run to completion. (IBM System/360)

Relocation and protection, Mv < Mp:

One protection count and one field register (addresses formed and checked by logical operations)
   Method: All programs are written as though their origin were location 0. The count register determines the number of high-order bits to be examined. The field register is then compared for identity with the requested address.
   Limits: Memory allocation blocks must be in power of 2. Unless blocks are the same size, the memory utilization can be poor. Although faster than the following scheme (which requires a hardware adder), the inflexibility of location and size makes it restrictive. (IBM 7040)

One set of protection and relocation registers (base address and limit registers). Also called boundary registers.
   Method: All programs written as though their origin were location 0. The relocation register specifies the actual location of the user, and the protection register specifies the number of words allowed.
   Limits: As users enter and leave, primary-memory holes form, requiring the moving of users. Pure procedures can be implemented only by moving impure part adjacent to pure part. (CDC 6600, PDP-6)

Two sets of protection and relocation registers. Two segments.
   Method: Similar to above. Two discontiguous physical areas of memory can be mapped into a homogeneous virtual memory.
   Limits: Similar to above. Simple, pure procedures with one data array area can be implemented. (UNIVAC 1108, PDP-10)

n > 3 sets of protection and relocation registers.
   Method: Similar to above. More similar to page mapping.
   Limits: Has not been used in any conventional computer.


Mapping.  Mv  >  Mp: 
Memory  page  mapping 


Memory  page  segmentation  mapping 


Indirect  references  through  a  descriptor 
table  to  segments. 


For  each  page  (2''  to  2'-  words)  in  a  user's  vir- 
tual memory,  corresponding  information  is 
kept  concerning  the  actual  physical  location 
in  primary  or  secondary  memory.  If  the 
map  IS  m  primary  memory,  it  may  be  desir- 
ableto  have  "associative  registers"  at  the 
processor-memory  interface  to  remember 
previous  reference  to  virtual  pages,  and  their 
actual  locations.  Alternatively,  a  hardware 
map  may  be  placed  between  the  processor 
and  memory  to  transform  processor  virtual 
addresses  into  physical  addresses. 

Additional  address  space  is  provided  beyond  a 
virtual  memory  above  by  providing  a  seg- 
ment number.  This  segment  number  ad- 
dresses or  selects  the  page  tables.  This  al- 
lows a  user  an  almost  unlimited  set  of  ad- 
dresses. Both  segmentation  and  page  map 
look-up  is  provided  in  hardware.  May  be 
thought  of  as  two-dimensional  addressing. 

All  data  are  considered  part  of  a  descriptor 
array  which  is  referred  to  by  a  number.  A 
descriptor  table  indexed  by  the  descriptor 
number  is  used  to  locate  the  array  in  Mp 
and  give  its  size. 


Relatively  expensive.  Not  as  general  as 
following  method  for  implementing 
pure  procedures.  (Atlas,  CDC-3500. 
SDS-940) 


Expensive.  Little  experience  to  judge 
effectiveness.  (GE  645.  IBM  360/67) 


An  indirect  reference  must  be  made  to 
the  description  table  in  Mp.  (B  5500) 




Part  1  I  The  structure  of  computers 


z  is  encountered  in  the  program,  the  information  at  Mp[z]  will 
be  obtained.  There  are  still,  however,  two  different  ways  to  obtain 
the  effect  of  a  virtual  memory. 

First,  one  can  operate  interpretively,  with  a  software  system 
taking  the  place  of  hardware.  That  is,  the  programs  of  all  the  users 
are  in  a  nonmachine  language  (e.g.,  a  higher  procedure-oriented 
language),  and  each  access  in  the  language  is  processed  by  the 
software  interpreter  before  an  access  is  made  to  Mp.  It  is  clear 
that  all  the  logical  power  of  a  memory  mapping  is  available  with 
this  scheme.  The  only  drawback  is  the  loss  of  efficiency  from  the 
interpretation,  which  may  range  from  a  factor  of  5  to  100.  Conse- 
quently this  scheme  is  used  only  in  special  circumstances,  such 
as  multiuser  time-shared  conversational  algebraic  languages. 

The second scheme is to modify the code at the time it is placed
in the Mp for a given run, so that all addresses in the code
correspond to the actual Mp addresses used. That is, an assembly or
translation operation is performed each time the program is placed
in Mp. The advantage of this scheme is that no further address
calculations are necessary. There are three disadvantages. Assembly
operations are expensive so that, although the scheme is tolerable
if the program is brought in once and run to completion, it
is not tolerable if programs are continually being swapped in and
out of Mp. In addition, the program must be laid into continuous
intervals of Mp corresponding to predetermined segments of the
program, for assembly occurs on a static representation of the
program and cannot unravel the potential effect of address
algorithms. Finally, the size of Mv (i.e., the addresses used
externally) must be not greater than Mp.
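
The load-time translation just described can be sketched as follows.
This is only an illustration under stated assumptions, not any
particular loader: the instruction format (an operation name, an
address field, and a flag marking whether the field really is an
address) and the names Instr and relocate are hypothetical.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    op: str            # hypothetical operation name
    addr: int          # address field, written as if the program began at 0
    is_address: bool   # False for constants, which must not be relocated

def relocate(program: list, origin: int, mp_size: int) -> list:
    """Rewrite every address field so that it names an actual Mp location."""
    if origin < 0 or origin + len(program) > mp_size:
        raise ValueError("program does not fit in one continuous interval of Mp")
    return [
        ins._replace(addr=ins.addr + origin) if ins.is_address else ins
        for ins in program
    ]

program = [Instr("LOAD", 2, True), Instr("ADDI", 5, False), Instr("JUMP", 0, True)]
loaded = relocate(program, origin=3000, mp_size=8000)
```

Once the pass has run, no address calculation remains at execution
time, but the whole pass has to be repeated whenever the program is
moved, which is the cost the text points out.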

Relative to these software schemes (one interpretive and very
expensive, the other involving assembly, i.e., compilation, and
loading), the hardware schemes to be described appear as address
interpreters, where the cost of continuous interpretation has been
made tolerable.

Protection  for  words  or  pages  hardware.  There  are  three  schemes 
in  Table  6  that  provide  a  means  of  protecting  one  part  of  Mp 
against  references  from  other  programs.  The  rationale  for  these 
designs is that there will be only two users (or user classes), one
user  being  superior  and  assumed  perfect  (its  program  debugged). 
References  to  Mp  via  the  imperfect  program  to  a  perfected  and 
superior  part  of  Mp  are  forbidden.  These  schemes  provide  no 
method  of  hardware  mapping,  and  physical  addresses  are  the  same 
as  virtual  addresses.  In  the  simplest  scheme,  as  in  the  IBM  1800 
(Chap.  33),  a  protect  bit  is  added  to  every  word  in  Mp,  that  is, 

    Mp[0:2^16 - 1]<0:(w - 1), protect_bit>

Every reference Mv[z] takes place as

    Mv[z] := (¬Mp[z]<protect_bit> → Mp[z];
              Mp[z]<protect_bit> → (protection violation ← 1))

That  is,  any  reference  to  a  word  with  a  protect  bit  causes  an  error. 
The  other  two  schemes  protect  on  the  basis  of  blocks  of  words. 
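
A minimal sketch of the protect-bit check may make the rule concrete.
The class and method names are hypothetical, and the privileged flag
(which lets the superior user bypass the check) is an assumption; the
text only says that references by the imperfect program are forbidden.

```python
class ProtectionViolation(Exception):
    """Raised when an unprivileged program touches a protected word."""

class ProtectedMemory:
    def __init__(self, size):
        self.words = [0] * size
        self.protect_bit = [False] * size   # one extra bit per word of Mp

    def reference(self, z, privileged=False):
        # Physical and virtual addresses are identical: no mapping is done.
        if self.protect_bit[z] and not privileged:
            raise ProtectionViolation(f"reference to protected word {z}")
        return self.words[z]

mp = ProtectedMemory(16)
mp.words[3] = 7
mp.protect_bit[3] = True
```

Changing users means resetting bits word by word, which is why the
table calls the method nearly useless when user jobs change often.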

Protection and relocating register(s) hardware. A protection and
relocation register mechanism is used in four schemes of Table
6. These provide either one concatenated, one additive, two
additive, or n additive register pairs for mapping a single program
into one, one, two, or n nonadjacent blocks in Mp. The authors know
of no schemes where more than three registers are used; this would
really be akin to using a more general page map. Generally, these
schemes restrict Mv ≤ Mp.

An additive protection and relocation register pair is shown
in Fig. 15, in which four users are occupying an Mp[0:7999]. Each
user program is written to occupy a continuous address space in
a virtual Mv. Thus in ISP, when Pc is running programs for user j,
which address Mv[z] with z varying from 0 to v_j - 1, the mapping
uses actual memory. The action is

    Mv[z] := ((z < Protection) → Mp[z + Relocation];
              (z ≥ Protection) → (Protection violation ← 1))

Protection  and  Relocation  are  the  two  registers  that  specify  map- 
ping. The  implementation  of  this  scheme  generally  takes  the  form 
of  adding  the  contents  of  the  relocation  register  after  all  address 
calculations  have  taken  place.  Thus,  in  PMS  we  might  think  of 
the  structure 

    Mp --- K(address translation) --- Pc
           |
           M(Protection, Relocation)
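
The additive mapping can be tried out with the register values
tabulated in Fig. 15 (given there in thousands of words). The sketch
below is illustrative only; the names USERS and translate are
hypothetical, and the registers are modeled as a (relocation,
protection) pair per user.

```python
class ProtectionViolation(Exception):
    pass

# user number -> (Relocation, Protection), in words, after Fig. 15
USERS = {1: (0, 2000), 2: (3000, 3000), 3: (6000, 2000), 4: (2000, 1000)}

def translate(z, relocation, protection):
    """Map virtual address z to an Mp address, or signal a violation."""
    if 0 <= z < protection:
        # The relocation register is added after all other address arithmetic.
        return z + relocation
    raise ProtectionViolation(f"address {z} is outside the user's {protection} words")

# The four users exactly tile Mp[0:7999] without overlap.
blocks = sorted((r, r + p) for r, p in USERS.values())
```

Because only one pair of registers is loaded at a time, switching
users costs two register loads, but each user must occupy a single
contiguous block of Mp.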

Page-map hardware. Figure 16 shows the memory allocation using
a page map. Note that, of the 4,096 words it is possible to define
by the map, the range 1,024 to 2,047 is actually undefined. Along
with the map containing the addresses to words in actual Mp, it
is desirable to have access or protection control information. Such
information might specify:

1  No  restrictions  (any  form  of  reading  or  writing  can  take 
place). 

2  Read  only  as  data. 

3  Read  only  as  a  program. 

4  Writing. 

5  Undefined. 




6  Defined  but  located  in  Ms. 

7  This  page  has  been  written  in  (to  know  whether  a  copy  in 
Ms  has  to  be  updated). 

8  This  page  has  been  accessed. 

This scheme is essentially a generalization of n protect-relocate
registers but includes more control bits, suggested above, and
restricts each block to be the same size. Note that Mv can be
greater than Mp. In addition, parts of the virtual memory may
remain unused.
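
As a rough sketch, a page map with a few of the control bits listed
above might look like this. The page size of 1,024 words, the frame
numbers, and all names (PageEntry, translate, PageFault) are
assumptions made for the example; only the undefined range 1,024 to
2,047 is taken from the text.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 1024  # words per page (assumed)

class PageFault(Exception):
    """The page is defined but currently located in Ms."""

class ProtectionViolation(Exception):
    """The reference is undefined or not permitted."""

@dataclass
class PageEntry:
    frame: Optional[int]     # position of the page in Mp; None if it is in Ms
    writable: bool = True
    written: bool = False    # set on write, so the copy in Ms can be updated
    accessed: bool = False

def translate(page_map, z, write=False):
    page, offset = divmod(z, PAGE_SIZE)
    entry = page_map.get(page)
    if entry is None:
        raise ProtectionViolation(f"virtual address {z} is undefined")
    if write and not entry.writable:
        raise ProtectionViolation(f"page {page} may not be written")
    if entry.frame is None:
        raise PageFault(f"page {page} must be brought in from Ms")
    entry.accessed = True
    entry.written = entry.written or write
    return entry.frame * PAGE_SIZE + offset

# Virtual page 1 (addresses 1,024 to 2,047) is left undefined.
page_map = {0: PageEntry(frame=5),
            2: PageEntry(frame=1, writable=False),
            3: PageEntry(frame=None)}
```

Note that the frames need not be adjacent or in order, which is what
distinguishes the page map from the register-pair schemes.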

There are two ways the above scheme is usually implemented:

1  A complete map is first considered as a conventional,
explicitly addressed M whose addresses correspond to the
virtual-address pages. At a given page-memory address the
contents of the map specify the address in Mp. The map
is similar to an indirect reference. However, the map is
usually about 10 times faster and about 1/1,000 the size,
since it keeps track only of pages, not words. The PMS
structure is

    Mp --- M.map --- Pc

2  The map is retained in Mp and referenced by a protection
and relocation register which are set for the particular active
user. In order to avoid making references to Mp for each
word reference to Mv by a Pc, a small, fast M(content
address) is placed between Pc and Mp. The PMS structure is

    Mp --+-- L(data) ----------------------------------+-- Pc
         |                                             |
         +-- K(address translation) <-- L(addresses) --+
             M(content address; 8 to 16 words)
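
The effect of the small content-addressed memory in the second
implementation can be sketched as below. The replacement rule (discard
the least recently used entry) is an assumption, since the text does
not say how entries are chosen for removal; the class name and the
counter map_references are likewise hypothetical.

```python
from collections import OrderedDict

class AssociativeMap:
    def __init__(self, page_table, page_size=1024, registers=8):
        self.page_table = page_table     # stands in for the map held in Mp
        self.page_size = page_size
        self.registers = registers       # the text suggests 8 to 16 words
        self.recent = OrderedDict()      # page -> frame, oldest entry first
        self.map_references = 0          # extra references made to Mp

    def translate(self, z):
        page, offset = divmod(z, self.page_size)
        if page in self.recent:
            self.recent.move_to_end(page)          # mark as recently used
        else:
            self.map_references += 1               # consult the map in Mp
            self.recent[page] = self.page_table[page]
            if len(self.recent) > self.registers:
                self.recent.popitem(last=False)    # drop least recently used
        return self.recent[page] * self.page_size + offset

am = AssociativeMap({0: 5, 2: 1})
results = [am.translate(z) for z in (0, 1, 2, 2048, 3)]
```

In this run five word references cost only two references to the map
in Mp, one per distinct page touched.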


Memory-segmentation hardware. Figure 9 (page 574) in the
introduction to the IBM System/360 shows the logical mapping process
for a segmented memory. There is provision for a very large
two-dimensional virtual-address space. This scheme is discussed
extensively in the literature [Arden et al., 1966; Dennis, 1965;
Gibson, 1966]. The physical implementation is similar to that of
paging. Note that two levels of mapping are provided: the segment
map and the page maps. The two levels facilitate the sharing of a
single segment by two jobs.
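
A small sketch shows why the two levels make sharing easy: two jobs
can point different segment numbers at the same page map. The
dictionary layout, the page size, and the names used here are
assumptions made for illustration, not the System/360 or GE 645
formats.

```python
PAGE_SIZE = 1024  # words per page (assumed)

def translate(segment_map, segment, z):
    """Two-dimensional address (segment, z) -> Mp address."""
    page_map = segment_map[segment]          # first level: select a page map
    page, offset = divmod(z, PAGE_SIZE)
    return page_map[page] * PAGE_SIZE + offset   # second level: page -> frame

shared = {0: 7, 1: 3}                 # one page map, owned by neither job
job_a = {0: {0: 2}, 1: shared}        # job A calls the shared segment 1
job_b = {0: {0: 4}, 5: shared}        # job B calls the same segment 5
```

Both jobs reach the same word of Mp through their own segment names,
and a change to the shared page map is seen by both at once.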

The Burroughs B 5000 (Chap. 22) and the later B 6500 have
a mapping that is more closely integrated into the Pc because they
provide a variable-sized address space (not paged) within a
segment. The segments are named, and a large number of segments
exist.


Table of user location information

User   User addresses          Relocation   Protection
       (in 1,000s of words)    register     register
1      0 to 2                  0            2
2      0 to 3                  3            3
3      0 to 2                  6            2
4      0 to 1                  2            1

Hardware registers when user 2 is running: Relocation = 3,
Protection = 3.

Fig. 15. Memory allocation using a boundary (relocation and protection)
register.

Interprogram  communication 

The dimension of interprogram communication is completely
correlated with the multiprogramming dimension, as we have
previously noted. To have a problem of intercommunication, there must
be a structure of components that require communication. At the
simplest level the dimension is represented by a single program,
and there is no need for intercommunication. Variables of the
program are completely accessible to the whole program, and the
address space is essentially uniform.


[Figure: a map locating the user's virtual memory in absolute memory.]

Fig. 16. Memory allocation using a page allocation map.

The second value of the dimension, subroutine calling, produces
a hierarchy of communication contexts. There is not a fixed
number of levels to the hierarchy, since each subroutine may call
others ad nauseam. When subroutines are present, address names and
values within the subroutine become addresses which are local
to that part of the subprogram. Such a structuring is apparent
when looking at the higher-level languages such as FORTRAN,
ALGOL, and PL/I, where there are explicit statements for
controlling the names (addresses) that are available to each of the
parts of the program. The concept of subroutine structure has been
with us almost from the first programs.

The next value of the dimension relates to signaling within a
single process. It is akin to subroutines embedded in hardware.
These are called extracodes and were perhaps first suggested for
the Atlas (Chap. 23). Each extracode can be looked at as just a
call to a specific subroutine. The variables of the user (caller's)
program are made available to the called (extracode defined)
program. The calling usually is accompanied by a context shift,
in which a completely different program (one that is used by any
number of calling programs) takes command to interpret the
instruction. This scheme is used in systems which are controlled by
a special software monitor. When a function such as the input
or output of a file is required, the main program issues a call to
the monitor to make the transfer. (In theory, the monitor knows
about conditions in the system and has the capability to perform
the complex function.) A central monitor control can then begin
to run another program if the request is one which would normally
halt the computer. This form of communication is useful to supply
extra facilities to users and to have a method of knowing what
the users are doing (e.g., so that equipment will be better utilized).

As  more  complex  program  structures  are  directly  represented 
by  the  hardware,  the  intercommunication  complexity  also  in- 
creases beyond  the  simple  subroutine  call.  If  a  segmented-memory 
scheme  is  used,  the  problem  of  communicating  between  the  seg- 
ments can  be  solved  in  a  range  of  ways.  The  value  of  the  range 
would  be  somewhere  between  ignoring  the  problem  with  the 
hardware  and  providing  methods  for  naming  of  addresses  between 
the  communicating  segments. 

In the above cases, the communication among the various
programs or parts of programs is done explicitly by one program
to another program. The instruction trap does not fit this view
so nicely. Here, conditions occurring within a single process which
are not explicitly called cause another part of the program to be
called. Typical conditions which cause traps are arithmetic results
outside expected range or erroneous program conditions (e.g.,
trying to call someone else's program). The trap causes a change
in context that is synchronized with the process causing it.
Trapping is a form of program interruption; a trap is an intraprocess
interrupt as distinct from interprocess interrupts.

Intercommunication between two independent processes (being
carried out by two independent components) is usually
accomplished by using the program interrupt. The interrupting process
requests that a program interrupt occur in a component (the
interruptee). The interrupter's request is acknowledged by the
interruptee, and a change of process state occurs in the interruptee;
a new process is then run in the interruptee on behalf of the
interrupter. The program interrupt is used among processors in
a multiprocessor system and between 1 Pc and n Pio's. A control
K may also use the program-interrupt request to communicate
with its superior Pio or Pc. For example, a Pio does not usually
have the logical capability to execute an algorithm which would
decide what action is to be taken for various error conditions.

Usually  the  interruptee  is  equipped  with  certain  logic  which 
is  capable  of  arranging  priorities  of  requesting  interrupters.  The 
typical  kinds  of  interrupt  requests  are  component  faults  (e.g., 
parity  error),  a  timer  has  counted  down,  and  various  task  comple- 
tions (e.g.,  a  program  has  completed,  a  tape  unit  has  rewound, 
a  disk  arm  has  stopped  moving,  a  certain  record  has  been  found 
on  tape,  a  buffer  is  full). 

State diagrams would show how each of the communication
methods above is similar to the others. A typical interrupt state
diagram is shown in Fig. 17. There are four states: normal process
interpretation, process state saving, interrupt process
interpretation, and process state restoration. The sequence is as follows:

1  Normal instruction interpretation is occurring in the
interruptee.

2  The interrupter requests an interrupt.

3  After some delay, t.acknowledgment, a state is reached in
which part of the interruptee's process state is saved.

4  After t.acknowledgment + t.save, a program is running in
the interruptee in response to the interrupter.

5  The interrupt program is run for t.interrupt.

6  At the completion of the interrupt program, the original
process state is restored in the interruptee.

7  After t.restore, normal processing resumes in the
interruptee.



The  significant  attributes  of  the  system  are  the  various  times  re- 
quired to  move  from  state  to  state.  These  times  are  directly  related 
to  the  amount  of  process  state  which  must  be  saved  (and  restored) 
when  switching  context. 
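
The seven-step sequence and its times can be laid out as a simple
timeline. This is a sketch only: the function names are hypothetical,
and the rule that saving (or restoring) costs one memory cycle per
word of process state is an assumption used to show how the times
depend on the amount of state.

```python
def state_transfer_time(words_of_state, t_memory_cycle):
    # Assumed cost model: one Mp cycle per word of process state moved.
    return words_of_state * t_memory_cycle

def interrupt_timeline(t_request, t_acknowledgment, t_save, t_interrupt, t_restore):
    """Return (time, event) pairs for one interrupt of the interruptee."""
    events = [(t_request, "interrupt requested")]
    t = t_request + t_acknowledgment
    events.append((t, "process state saving begins"))
    t += t_save
    events.append((t, "interrupt program begins"))
    t += t_interrupt
    events.append((t, "process state restoration begins"))
    t += t_restore
    events.append((t, "normal processing resumes"))
    return events

t_save = state_transfer_time(words_of_state=4, t_memory_cycle=2)
timeline = interrupt_timeline(100, 3, t_save, 50, t_save)
overhead = timeline[-1][0] - timeline[0][0] - 50   # time not spent in the interrupt program
```

With these made-up figures the interrupt program starts 11 time units
after the request, and doubling the saved state would add 8 units to
the response time and 16 to the total overhead.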

The intercommunication problem is probably the least
understood dimension in the computer space. It is rather intimately
related to the ISP, in that the various calling methods (implicitly
and explicitly) depend on the ISP. Also, the amount of processor
state (a function of the ISP) affects the response time for making
context transitions. Most interrupt systems allow several
independent classes and/or sources of interrupters. The classes are
arranged in priority so that lower-level interrupters are ignored
until higher-level interrupt programs are run to completion (see
Chap. 42 on the SDS 910-9300 series). The design problems
associated with intercommunication are not those of
implementation but of knowing what should be implemented. The PMS
structure part and the corresponding register-transfer
implementations for intercommunication are, by comparison, straightforward.
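
The priority rule mentioned above (lower-level interrupters wait until
higher-level interrupt programs finish) can be sketched in a few
lines. The numbering convention, in which level 0 is the highest
priority, and the function name acknowledge are assumptions for the
example and are not taken from any particular machine.

```python
def acknowledge(pending, running_level=None):
    """Pick the request to acknowledge next, or None if all must wait.

    pending maps a source name to its priority level (0 is highest).
    running_level is the level of the interrupt program now running,
    or None during normal processing.
    """
    if not pending:
        return None
    source = min(pending, key=pending.get)
    if running_level is not None and pending[source] >= running_level:
        return None   # ignored until the running program completes
    return source

pending = {"tape rewound": 3, "parity error": 0, "timer": 1}
```

The hard part, as the text says, is not this selection logic but
deciding which sources and levels a machine ought to provide.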

Processor concurrency

Concurrency (parallelism) in the processor is the number of events
or logical operations that are happening at a given time. If the
basic logic technology is held constant, decreasing the processing
time (increasing the power) requires increasing the number of
parallel operations. An exact measure of parallelism can be made
in terms of the number of n-bit operations made per clock pulse.
The parallelism in a structure is also a measure of its complexity;
to have a highly parallel structure implies control structure
together with multiple data paths (and operations) which can be
concurrently evoked.

Processor parallelism is also necessary to overcome the
technological limits on Mp speed. Thus it is difficult to isolate
completely the processor from the memory.

Flynn [1966] categorized high-speed processors by whether
there are single or multiple instruction streams and whether each
stream has single or multiple data streams. The CDC 6600 and
IBM Stretch are examples of a single instruction stream and a
single data stream. An ILLIAC IV processor has a single
instruction stream with multiple data streams. Thus, the single instruction
stream and multiple data stream are a form of array processing
in which an instruction performs an operation on multiple data
elements.

The CDC 6600 main processor has multiple instructions of a
single stream in the fetch, buffering, and decoding process at a
given time. In addition, instructions are being executed in parallel
by the 10 parallel data-operations. The 6600 has functionally
different data operators, although a system could exist in which these
operators are the same, or, if the operator were much faster, a
single unit could be used sequentially. Depending on the
utilization of the 10 data units, there could be a computer with several
processors which share a common set of data-operations. The 6600's
peripheral processors are implemented in a mode whereby several
instruction streams are processed in parallel by a single processor.
The simplicity of the shared processor for multiprocessing or
parallel processing thereby provides still another form of
parallelism. The following subsections discuss particular forms of
parallelism. At one end of the dimension there is the most primitive
structure, a serial processor, and at the other end there are
pipeline processors.


[Figure: four states (interpret instruction, normal interpretation;
preserve processor state; interpret instruction in Mp, interpretation
in interrupted state; restore processor state). Transitions: interrupt
request from interrupter, after t.acknowledge; begin interrupt
program, after t.save; interrupt complete, end interrupt program;
after t.restore; no interrupt request.]

Fig. 17. State diagram for the interrupt process.

Serial processors. At the most elementary level only one bit of an
n-bit word is operated on at a given time. There is no concurrency,
and even the most trivial operations on n bits require a time of
n. The bit-serial processor was used in the first generation because
the cyclic primary memories to which it connected were
fundamentally bit-serial (see page 73). Although the processor memory
could be made to operate on a parallel basis where words were
available in one unit of time, such a tradeoff was not worthwhile
because of the relatively long access time to Mp. The word lengths
for serial processors tended to be relatively long, because the cost
is independent of word length (see page 216).
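
The claim that an operation on n bits takes a time of n can be seen
in a bit-serial adder, sketched below. A single one-bit adder and a
carry flip-flop handle one bit position per clock time; the function
name and the returned clock count are illustrative assumptions.

```python
def serial_add(a, b, n):
    """Add two n-bit words one bit per clock time, low-order bit first."""
    carry = 0
    result = 0
    clocks = 0
    for i in range(n):
        x = (a >> i) & 1
        y = (b >> i) & 1
        s = x ^ y ^ carry                      # sum bit from the one-bit adder
        carry = (x & y) | (carry & (x ^ y))    # carry held for the next clock
        result |= s << i
        clocks += 1
    return result, carry, clocks
```

The same one-bit adder serves any word length, which is why the cost
of the serial processor is independent of n while its time is not.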

Parallel-by-word processors. The simple parallel-by-word processor
is the most common processor of the first to third generation. This
occurred in part because Mp became parallel by word. Within
the processor we assume that almost every internal
register-transfer operation requires one or more clock times. (A simple
multiply operation usually takes between n/2 and 2n clock times.)
We do not mean to rule out multiple simultaneous internal
operations within the processor, but they are exceptions. With only a
view of a processor's registers, it is easy to tell if multiple
operations are possible. Most of these processors do only one operation
at a time. As a rule, the simple processor is locked to the
primary-memory cycle time (usually core). Approximately 2 to 10 events
(clock times) are available within the processor. For example, the
PDP-8 (Chap. 5) has four events, and the IBM 7090 (Chap. 41)
has 10 events. A precise measure of parallelism would count the
number of operations per clock time for given program conditions.

Multiple instruction streams, 1 Pc. The only example of this
structure in the book is the CDC 6600. Opportunities for such
a structure are possible with the parallel computer suggested by
Lehmann (Chap. 37).

Multiple  data  streams.  The  most  obvious  implementation  of 
multiple  data  streams  with  one  or  more  instruction  streams  is 
the  array  processor.  Part  4,  Section  2  is  devoted  to  these  struc- 
tures. 

1-Instruction buffer. The 1-instruction buffer is a form of looking
ahead in the instruction-interpretation cycle and is about the
simplest form of parallelism in a parallel-by-word processor. A
single register is assigned the role of holding the next instruction
to be interpreted. The IBM 7094 Instruction Backup Register
(Chap. 41) is typical of this case. In the 7094 two instructions are
fetched at a time. More generally the next instruction would be
fetched during the execution of the current instruction.

n-Instruction buffering. Multiple instruction buffering is a
generalization of the 1-instruction buffer above. It can take several forms
depending on the algorithms used to fetch the next instruction
(i.e., the look-ahead) and the organization of the memory holding
the instructions. Stretch (Chap. 34) and the CDC 6600 (Chap. 39)
use instruction buffers. A small, restricted content-addressable
memory holds a block of instructions. In the simplest case of these
computers a block of memory, relative to the instruction counter,
is kept in the local instruction buffer memory.

Look-aside buffering (slave) memories. Look-aside is a more general
form of instruction buffering because both instructions and
commonly accessed data tend to migrate to the faster look-aside
memory. This scheme is discussed for the IBM System/360 Model
85 (page 574). The look-aside memory suggested by Wilkes
[1965] is a content-addressable memory for retaining the active
(most recently used) memory words.

Pipeline processing. Pipeline (assembly-line) concurrency is the
name given to a system of multiple functional units, each of which
is responsible for partial interpretation and execution of the
instruction stream. A pipeline processor has several partially
completed instructions in process at one time. Each processor stage
operates on a specific part of the instruction, e.g., instruction fetch,
effective-address calculation, operand fetching, execution of the
operation specified by the instruction, and results storing. A PMS
diagram for a pipeline processor is given in Fig. 19. Thus there is
a separate functional unit for each state suggested by the state
diagram of Fig. 4. There must be interlocks so that sequence is
preserved, i.e., so that results are not used until they are available.
Figure 18 shows a time/function diagram of a pipeline processor.
There are at least three instructions being interpreted
simultaneously. Although we have not extended Fig. 18, we would expect
the processor in the sketch to operate on about eight instructions
at one time. Note that the processor sometimes completes later
instructions first. In this model there is only one instruction
fetching, one operand fetching, and one operand storing unit, while
there are multiple data operation units. The particular number
of each type of unit is obviously not fixed for all structures but
depends heavily on the memory system, the number of instruction
streams, and the ISP.


[Figure: overlapped time lines for Instruction 1, Instruction 2, and
Instruction 3 against time t. Legend:
    t_oq   Operation time to determine instruction q
    t_aq   Access time to determine instruction q
    t_ov   Operation time to determine datum v
    t_av   Access time to determine datum v
    t_o    Operation time for instruction
    t_oo   Operation time to determine operation of instruction
    t_q    Total instruction time]

Fig. 18. Time-function diagram for a pipeline processor.


[Figure: Mp connected to M.processor state (program counter),
K.instruction fetch with M.instructions, K.data fetch with M.data,
and K.data store with M.data. Stages: instruction fetch, data setup,
execution, data restore.]

Fig. 19. Example of processor parallelism by spatially independent
control function (pipeline processing) PMS diagram.

A processor may require many data-operation units in order
to avoid bottlenecks. Each unit is independent and may be
functionally capable of carrying out only selected tasks. Multiple
data-operations are normally desirable in a pipeline processor
so that several operations can be carried out at a time, since
most of the processing time within the processor is spent on the
operations (e.g., multiplication, division, shifting, etc.).
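
The overlap in a pipeline can be illustrated with an idealized
time/function table. The sketch assumes that every stage takes exactly
one time unit and that no interlock ever stalls an instruction, which
real machines do not guarantee; the stage list follows the stages
named in the text, and the function names are hypothetical.

```python
STAGES = ["instruction fetch", "effective-address calculation",
          "operand fetch", "execution", "result store"]

def schedule(k):
    """Return {time: [(instruction, stage), ...]} for k instructions."""
    table = {}
    for i in range(k):                      # instruction i enters at time i
        for j, stage in enumerate(STAGES):
            table.setdefault(i + j, []).append((i, stage))
    return table

def total_time(k, pipelined=True):
    s = len(STAGES)
    # Pipelined: fill the s stages once, then finish one instruction per unit.
    return s + k - 1 if pipelined else s * k
```

Under these assumptions three instructions finish in 7 time units
instead of 15, and at time 2 all three are in process at once, each in
a different stage.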


Conclusions

You now have our view of the important aspects of the
stored-program computer. We have tried to organize the parameters as
dimensions so that a computer can be viewed as a point (or points)
in a multidimensional space. The previous discussion has
enumerated the values of one dimension, while (in effect) holding the
values of other dimensions constant. The dimensions are highly
correlated, especially with cost and evolutionary time. We have
been brief in presenting the dimensions because the book is
primarily about computer examples. However, one should be able
to recognize the dimensions and values when they are encountered
within the context of a particular computer.

The remainder of the book is organized around these
dimensions. The examples lose the identity of dimensions because they
are descriptions of points in the space (computers). Furthermore,
the descriptions themselves are not especially organized around
these dimensions but are based on the designer's own view of his
machine.


References 

AdamA60,66,67,68; AdamC60; ArbuR66; ArdeB66; BowdB53;
CampR50; CasaC62; ChasG52; CoxJ6f5; DennJ65; FlynM66; ForrJ51;
GibsC66; KnigK66; MolnC67; NiseN66; RandB68; RoseS69; SamuA57;
SerrR62; WeikM55,61,64; WilkM51a,65; WillF49.


PART  2 


The  Instruction-set  Processor:  main-line  computers 

To  have  a  "main  line"  of  computers  is  to  have  a  family  that  predominates  through 
the  generations.  Predominance  can  probably  best  be  measured  by  the  percentage 
of  distinct  computers  produced  within  the  family,  as  opposed  to  outside  it.  Members 
of the family need not all be identical; in particular, evolution over time can be tolerated.
But  it  must  be  the  case  that  there  is  at  any  moment  a  "standard"  design  which 
is  seen  as  emerging  from  the  just  prior  "standard"  design. 

Within  these  definitions  there  indeed  has  been  a  main  line  in  computer  systems. 
It  is  based  on  the  Burks,  Goldstine,  and  von  Neumann  memorandum,  reprinted  as 
Chap.  4.  The  most  striking  characteristic  is  the  evolution  from  1  address  organization 
(1),  through  index-register  (1  +  x)  to  general-register  (1  +  g)  organization.  Left 
outside  the  main  line  have  been  multiple-address  organizations,  character  machines, 
and  stack  machines.  This  seems  to  be  an  appropriate  description,  even  though  a 
character  machine  (variable-length  character  string),  the  IBM  1401,  probably  holds 
the  record  for  number  of  machines  produced  (when  each  model  of  the  IBM  Sys- 
tem/360 is  counted  as  a  separate  computer). 

A  second  characteristic  feature  has  been  the  PMS  structure,  which  has  evolved 
from  a  single  P  to  a  Pc-nPio  structure.  This  has  not  been  uniform  within  the  family, 
since  it  applies  only  to  the  larger  members;  the  small  machines,  such  as  the  PDP-8 
(Chap.  5),  have  no  separate  Pio's.  It  might  seem  that  all  computer  systems,  both 
within  and  without  the  family,  have  evolved  in  this  same  way.  But  this  disregards 
the  history  of  computer  development.  For  a  while,  in  the  early  fifties,  there  were 
seen  to  be  two  main  lines  of  potential  development:  scientific  computers,  featuring 
large  computation  and  small  input/output,  and  business  computers,  featuring  small 
computation and large input/output. The latter started to develop into the Pc-nPio
structure (with the IBM 702) but, instead of a separate line developing, scientific
computers  (with  the  IBM  704  and  UNIVAC  computers)  adopted  the  more  powerful 
input/output  structure.  Again,  despite  its  success,  the  1401  has  not  bred  a  new 
generation  of  computer  systems  in  its  image,  either  within  IBM  (where  one  might 
argue  that  the  overriding  consideration  was  to  have  a  uniform  series)  or  by  IBM's 
competitors. 

A  third  characteristic  of  the  main  line  is  the  use  of  binary  as  opposed  to  decimal 
as  the  basic  radix  of  the  machine.  This  affects  both  the  arithmetic  and  whether  logi- 
cal processing  (on  bit  vectors)  can  be  done.  The  issue  seems  almost  settled  in  the 
third  generation,  with  smaller  machines  being  binary  and  larger  machines  having 
multiple  data-types.  The  last  serious  venture  into  a  large  pure  decimal  machine  was 
the  UNIVAC  LARC,  delivered  in  1960.  In  retrospect,  the  difference  in  organizations 
between  binary  and  decimal  machines  seems  small  enough  so  that  we  have  included 
them  all  in  the  same  section. 

There  are  a  number  of  striking  features  that  are  characteristic  of  the  main  line 
but  do  not  differentiate  it  from  any  of  the  alternatives  that  have  actually  been 
produced. These features include the stored-program concept; the use of sequential
instructions of the operator-operand variety; the use of the word as an information
unit,  within  the  range  of  12  to  64  bits;  and  a  processor  state  of  less  than  100  words. 
Alternative  organizations  are  conceivable,  though  they  have  clearly  not  seemed 
practical  to  computer  designers.  For  instance,  in  the  early  fifties  there  was  an  at- 
tempt to  construct  an  electronic  plugboard  machine,  after  the  fashion  of  the  ENIAC 
and  the  IBM  CPC  (Card  Programmed  Calculator).  And  we  see  in  the  new  programmed 
desk  calculators  (Part  3,  Sec.  4)  yet  another  organization  that  is  rather  far  from 
the  main  line  (but  because  of  low  cost  may  yet  be  a  part  of  the  future  main  line). 
These  desk  calculators,  by  the  way,  are  decimal,  rather  than  binary. 


Section  1 

Processors  with  one  address  per 
instruction 

This  section  is  principally  concerned  with  the  ISP.  It  is  the 
largest  section  in  the  book,  reflecting  the  dominance  of  the 
one-address  organization  during  the  first  two  generations. 
Machines  with  index  registers  are  included,  but  not  machines 
with general registers, which are discussed in Sec. 2. Some
processors store two single-address instructions per word,
following the pattern of the IAS¹ (von Neumann) machine (Chap.
4). In machines with short word lengths, one single-address
instruction is stored in one or two words, for example, in the
16-bit IBM 1800 (Chap. 33) and in the 12-bit PDP-8 (Chap. 5).
The evolution of these machines can be seen by comparing first-
and third-generation machines (e.g., Whirlwind and the IBM
1800). In general, the section is arranged by increasing word
length or, alternatively, by increasing complexity and performance.


Preliminary  discussion  of  the  logical  design  of  an  electronic 
computing  instrument 

This article (Chap. 4) is important for historical as well as
technical reasons. It is one of a series² written in 1946 prior to
building  the  first  fully  stored-program  computer.  Although  its 
authors  were  not  engineers,  it  is  written  with  the  caution  of 
those  responsible  for  the  implementation  of  a  rather  significant 
development  task.  The  major  problems  for  the  computer  are 
identified,  the  alternatives  analyzed,  and  a  rationale  for  each 
decision  is  given.  If  computer  designers  were  all  required  to 
analyze  and  describe  their  machines  in  such  a  fashion  prior 
to  building  them,  there  would  be  fewer,  but  better,  computers. 
Some  of  the  especially  enjoyable  aspects  of  the  discussion  in- 
clude: 


¹Institute for Advanced Study, Princeton University, Princeton, N.J.
²The articles in the series were:

1. On the Principles of Large Scale Computing Machines (1946) [Goldstine and
von Neumann, 1963a].

2. Preliminary Discussion of the Logical Design of an Electronic Computing
Instrument, pt. I, vol. 1 (1946) [Burks, Goldstine, and von Neumann, 1963].

3. Planning and Coding of Problems for an Electronic Computing Instrument,
pt. II, vols. 1, 2, 3 (1947-1948) [Goldstine and von Neumann, 1963b, 1963c,
1963d].


1  Selection  of  word  length  and  number  base. 

2  Discussion  of  the  instructions  needed. 

3  Concern  for  the  input/output  structure  and  the  idea  of 
displays  (now  almost  a  reality). 

4  Rationale  for  not  including  floating-point  arithmetic 
(caution  about  the  technology). 

5  The  lack  of  necessity  for  the  rather  trivial  binary-decimal 
conversion  hardware  and  the  idea  of  cost  effectiveness. 

6  Analysis  of  the  addition,  multiplication,  and  division 
hardware  implementation.  (This  description  includes  a 
nice,  one-page  discussion  of  the  average  carry  length  for 
addition.) 

It  is  difficult  to  say  which  machines  have  been  influenced 
by  this  memorandum  since  the  idea  of  data  and  instructions 
stored  together  in  a  homogeneous  primary  memory  is  so  basic 
to  all  computers.  The  idea  of  the  single-address  instruction  set 
and  format  is  at  the  heart  of  all  the  machines  discussed  in  this 
section. The IAS design itself, however, did not have index registers. Many of the
machines  with  long  word  length,  like  IAS,  use  the  two-instruc- 
tions-per-word  format. 

Subsequent  machines  built  with  only  minor  variations  in- 
clude ORDVAC;  ILLIAC  I  at  the  University  of  Illinois  with  a  40-bit 
electrostatic  memory  and  vacuum-tube  logic;  AVIDAC,  ORACLE, 
MANIAC  I,  WEIZAC,  SILLIAC,  BESK,  DASK,  CSIRAC,  and 
JOHNNIAC  at  the  RAND  Corporation  with  a  40-bit  core  memory 
and  transistor  logic  [Gruenberger,  1968].  Other  similar  com- 
puters include  the  IBM  701  with  a  36-bit  word,  electrostatic 
memory  and  vacuum-tube  logic;  and  the  CDC  1604,  with  a 
48-bit  word,  core  memory,  and  transistor  logic  (possibly  in- 
fluenced by  MANIAC  II). 

The  DEC  PDP-8 

The PDP-8 is included as Chap. 5 to illustrate the effects of
a 12-bit word length. It is given in detail using a "top-down"
approach in order that the student may thoroughly understand
it by simulating it, interpreting it, writing microprograms that
emulate it, making incremental modifications to it, and
completely redesigning it.¹

The PDP-8, although not the first 12-bit computer, achieved
a status that made it the first standard for small, low-cost
dedicated computers. There is an active market now for computers
in this size and price range, to which the marketing
culture has responded with the names microcomputer,
minicomputer, and midicomputer for 8- to 12-, 12- to 16-, and 16-
to 24-bit word-length computers, respectively.²

The PDP-8 has a nearly minimal processor state because
the address and ISP integers are 12 bits. Twelve bits is just large
enough to represent data from external physical process
environments (analog signals) and also just right to address a
4096-word memory. System software (editors, assemblers,
compilers, etc.) can, surprisingly, all fit into a memory of this size.³
The processor state is only 26 bits, and the predecessor PDP-5
had a hardwired state of only 14 bits.

The PDP-8 is also discussed in Part 5, Sec. 2, page 396.

¹Perhaps also because of one of the authors' (GB) obvious attachment.
²See the computers in this size range in Chapter 3, Figure 2, page 43.
³Conceivably a corollary to Parkinson's law: Programs expand to fill every word in
the primary memory of a computer.

The Whirlwind I computer

Whirlwind I is based on Wilkes' EDSAC at Manchester University.
Chapter 6 describes the computer and gives a brief description
of vacuum-tube logic and electrostatic storage-tube technology.
The PMS structure of Whirlwind I with core memory is
given in Fig. 1.

The Memory Test Computer (MTC) of M.I.T.'s Lincoln Laboratory
was the first computer to use a core memory. MTC was
built to test the memory which Whirlwind I received in August,
1953. Subsequent modifications included the addition of another
2,048-word magnetic-core memory in September, 1953.

The machine's construction and technology are outstanding.
It has effective marginal checking and preventive-maintenance
test facilities. At the time the machine was dismembered and
moved from M.I.T., it had a use time availability of greater than
95 percent. Although Whirlwind I left M.I.T. in 1960, the machine
was reassembled and was operational as late as 1966.

The machine's PMS structure is a simple 1 Pc. The K to Mp
block transfers are via the Pc on a one-at-a-time, programmed
basis. A single data transfer can be initiated to a particular
device, thus providing some opportunity for input/output and
processing concurrency. The simple structure is due to the high
register costs of the vacuum-tube technology; thus only a single
central processor register is provided to hold (or buffer) data
during a K transmission to a T or Ms. Appendix 1 of Chap. 6,
which is from the programming manual, gives its instruction
set.

Fig. 1. Whirlwind I PMS diagram. [Diagram not reproduced. It shows
Mp (core; 16 b/w) connected through a fixed S to a vacuum-tube
Pc (50 kop/s; 16 b/w; 1 instruction/w; 1 address/instruction) with an
M (toggle switch; 32 w; 16 b/w), and K's for the console, paper-tape
reader and punch, Flexowriter, CRT display, light pen, film camera,
drum Ms, and magnetic-tape Ms.]

The IBM 1800

The  IBM  1800  (Chap.  33)  is  a  third-generation,  16-bit  computer. 
It  is  discussed  in  Part  5,  Sec.  2,  page  396. 

Some aspects of the logical design of a control computer:
a  case  study 

Chapter 7 presents the aerospace computer Apollo designed by
M.I.T.'s Instrumentation Laboratory. It is presented in contrast
to the general-purpose 16-bit computers, Whirlwind (Chap. 6)
and the IBM 1800 (Chap. 33). The Apollo computer uses an
M(read only) because it is obviously a problem to reload programs.
Kampe's SD-2 (Chap. 29) and Apollo (Chap. 7) are both
controllers and have other similar design constraints. The IBM
1800 is also used for control purposes. In fact, the computers
in this section up to and including the 24-bit SDS 910-9300
series are all designed for control environments. However, all
the latter machines have a goal of generality not present in the
Apollo.



The  SDS  910-9300  series 

The  SDS  910-9300  computers  are  illustrative  of  typical,  second- 
generation,  24-bit  computers.  The  computers  are  discussed  in 
Part 6, Sec. 2, page 542. Chapter 42 also attempts to show
how  implementation  affects  performance  for  the  series. 

The  LGP-30  and  LGP-21 

The LGP-30 and the later LGP-21 are presented in Chap. 16 and
discussed in Part 3, Sec. 2, page 216.

IBM  650  instruction  logic 

The  IBM  650  (Chap.  17)  is  a  one  plus  one  address  computer. 
Its attributes as a cyclic-memory computer, though hardly
apparent at the ISP level, are discussed in Part 3, Sec. 2, page
216.

The  IBM  7094  I,  II 

Part  6,  Sec.  1  shows  the  evolution  of  the  IBM  36-bit  scientific 
computers.  The  IBM  7094  II  (Chap.  41)  is  presented  for  many 
reasons  (page  517).  Among  them  are  its  effect  on  the  later  IBM 
System/360  and  its  position  as  the  standard  large  scientific 
computer  of  the  late  fifties  and  early  sixties. 

The  UNIVAC  system 

The UNIVAC system, first delivered in March, 1951, was later
known as UNIVAC I. UNIVAC (UNIVersal Automatic Computers)
was the second computer¹ to be manufactured by the
Eckert-Mauchly Computer Corporation, subsequently a division of
Remington-Rand.²

¹The Eckert-Mauchly BINAC was apparently the first computer to be
manufactured by a corporation.

²Eckert-Mauchly Computer Corporation was initially independent of Remington-Rand.

UNIVAC is a single-address, decimal computer with 12 digits/word.
Two instructions are stored per word. In effect, UNIVAC
is  a  decimal  version  of  the  IAS  computer.  The  Mp  consists  of 
1,000  words,  made  up  of  10  words/delay  line.  Each  delay  line 
requires  404  microseconds  to  recirculate. 
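
The memory figures just quoted can be checked with a little arithmetic. The sketch below is only illustrative: the three constants come from the text, while the variable names and the assumption that a wanted word is equally likely to be anywhere in its delay line (so that the mean wait is half a recirculation) are ours.

```python
WORDS = 1000              # Mp size in words, from the text
WORDS_PER_LINE = 10       # words held by one delay line, from the text
RECIRCULATION_US = 404    # microseconds per recirculation, from the text

# Number of delay lines needed to hold the whole primary memory.
delay_lines = WORDS // WORDS_PER_LINE

# Assumption (not stated in the text): the wanted word is equally likely
# to be at any position in the line, so on average half a recirculation
# elapses before it emerges; in the worst case a full one does.
average_wait_us = RECIRCULATION_US / 2
worst_wait_us = RECIRCULATION_US

print(delay_lines, average_wait_us, worst_wait_us)
```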

UNIVAC  is  significant  because  it  was  the  most  important 
computer  during  the  early  1950s.  Its  performance  record  is 
discussed in Chap. 8. The UNISERVO magnetic-tape system
was  rather  advanced  for  1950,  considering  performance,  error 
checking,  and  buffering.  Particularly  nice  is  the  ability  to  parti- 
tion the  input/output  system  for  off-line  printing  and  key 
punching. 

One-level  storage  system 

The  48-bit  Atlas  was  developed  at  Manchester  University  and 
subsequently  manufactured  by  Ferranti  Corp.  (now  part  of  Inter- 
national Computers  and  Tabulators).  The  development  began 
about  1960,  and  the  paper  was  written  in  1962.  The  importance 
of Atlas with respect to current and future machines is
discussed in Part 3, Sec. 6, page 274.

The  engineering  design  of  the  Stretch  computer 

The  IBM  Stretch  (also  called  the  IBM  Model  7030)  single- 
address  computer  (Chap.  34)  is  one  of  the  earliest  computers 
built  to  provide  maximum  computing  power  subject  to  no  ap- 
parent cost,  size,  and  producibility  constraints.  A  discussion 
of  its  importance  is  given  in  Part  5,  Sec.  2,  page  396. 


Chapter  4 


Preliminary  discussion  of  the  logical 
design  of  an  electronic  computing 
instrument¹

Arthur W. Burks / Herman H. Goldstine /
John von Neumann

PART  I 

1.    Principal  components  of  the  machine 

1.1.  Inasmuch  as  the  completed  device  will  be  a  general-purpose 
computing  machine  it  should  contain  certain  main  organs  relating 
to  arithmetic,  memory-storage,  control  and  connection  with  the 
human  operator.  It  is  intended  that  the  machine  be  fully  automatic 
in  character,  i.e.  independent  of  the  human  operator  after  the 
computation  starts.  A  fuller  discussion  of  the  implications  of  this 
remark  will  be  given  in  Sec.  3  below. 

1.2.  It  is  evident  that  the  machine  must  be  capable  of  storing 
in  some  manner  not  only  the  digital  information  needed  in  a  given 
computation  such  as  boundary  values,  tables  of  functions  (such 
as  the  equation  of  state  of  a  fluid)  and  also  the  intermediate  results 
of  the  computation  (which  may  be  wanted  for  varying  lengths  of 
time), but also the instructions which govern the actual routine
to  be  performed  on  the  numerical  data.  In  a  special-purpose 
machine  these  instructions  are  an  integral  part  of  the  device  and 
constitute  a  part  of  its  design  structure.  For  an  all-purpose  machine 
it  must  be  possible  to  instruct  the  device  to  carry  out  any  compu- 
tation that  can  be  formulated  in  numerical  terms.  Hence  there 
must  be  some  organ  capable  of  storing  these  program  orders.  There 
must, moreover, be a unit which can understand these instructions
and  order  their  execution. 

1.3.  Conceptually  we  have  discussed  above  two  different 
forms  of  memory:  storage  of  numbers  and  storage  of  orders.  If, 
however,  the  orders  to  the  machine  are  reduced  to  a  numerical 
code  and  if  the  machine  can  in  some  fashion  distinguish  a  number 
from an order, the memory organ can be used to store both numbers
and orders. The coding of orders into numeric form is discussed
in 6.3 below.

¹From A. H. Taub (ed.), "Collected Works of John von Neumann," vol. 5,
pp. 34-79, The Macmillan Company, New York, 1963. Taken from
report to U. S. Army Ordnance Department, 1946. See also Bibliography
Burks, Goldstine, and von Neumann, 1962a, 1962b, 1963; and Goldstine and
von Neumann, 1963a, 1963b, 1963c, 1963d.

1.4.  If  the  memory  for  orders  is  merely  a  storage  organ  there 
must  exist  an  organ  which  can  automatically  execute  the  orders 
stored  in  the  memory.  We  shall  call  this  organ  the  Control. 

1.5.  Inasmuch  as  the  device  is  to  be  a  computing  machine 
there  must  be  an  arithmetic  organ  in  it  which  can  perform  certain 
of  the  elementary  arithmetic  operations.  There  will  be,  therefore, 
a  unit  capable  of  adding,  subtracting,  multiplying  and  dividing. 
It  will  be  seen  in  6.6  below  that  it  can  also  perform  additional 
operations  that  occur  quite  frequently. 

The operations that the machine will view as elementary are
clearly  those  which  are  wired  into  the  machine.  To  illustrate,  the 
operation  of  multiplication  could  be  eliminated  from  the  device 
as  an  elementary  process  if  one  were  willing  to  view  it  as  a  prop- 
erly ordered  series  of  additions.  Similar  remarks  apply  to  division. 
In  general,  the  inner  economy  of  the  arithmetic  unit  is  determined 
by  a  compromise  between  the  desire  for  speed  of  operation — a 
non-elementary  operation  will  generally  take  a  long  time  to  per- 
form since  it  is  constituted  of  a  series  of  orders  given  by  the 
control — and  the  desire  for  simplicity,  or  cheapness,  of  the  ma- 
chine. 
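
The compromise described above can be illustrated with a short sketch. It is not the arithmetic organ of the memorandum: the function name `multiply_by_additions` and the choice of a shift-and-add scheme are assumptions made here to show how multiplication could be built from a properly ordered series of additions when it is not wired in. The count of additions it returns stands in for the time penalty of a non-elementary operation.

```python
def multiply_by_additions(a, b):
    """Multiply non-negative integers a and b by a series of additions.

    Hypothetical illustration: the multiplier b is examined one binary
    digit at a time. Returns (product, additions_used).
    """
    if a < 0 or b < 0:
        raise ValueError("sketch handles non-negative operands only")
    product = 0
    additions = 0
    addend = a
    while b > 0:
        if b % 2 == 1:            # low-order binary digit of the multiplier is 1
            product = product + addend
            additions += 1
        addend = addend + addend  # doubling is itself an addition
        additions += 1
        b //= 2                   # move on to the next binary digit
    return product, additions


product, additions = multiply_by_additions(13, 11)
print(product, additions)
```

A wired-in multiplier would deliver the same product in a single order, at the cost of more equipment.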

1.6.  Lastly  there  must  exist  devices,  the  input  and  output 
organ,  whereby  the  human  operator  and  the  machine  can  com- 
municate with  each  other.  This  organ  will  be  seen  below  in  4.5, 
where  it  is  discussed,  to  constitute  a  secondary  form  of  automatic 
memory. 

2.    First  remarks  on  the  memory 

2.1. It is clear that the size of the memory is a critical consideration
in the design of a satisfactory general-purpose computing machine. We
proceed to discuss what quantities the memory should store for various
types of computations.

2.2. In the solution of partial differential equations the storage
requirements are likely to be quite extensive. In general, one must
remember not only the initial and boundary conditions and any
arbitrary functions that enter the problem but also an extensive
number of intermediate results.

a For equations of parabolic or hyperbolic type in two independent
variables the integration process is essentially a
double induction. To find the values of the dependent variables
at time $t + \Delta t$ one integrates with respect to $x$ from
one boundary to the other by utilizing the data at time $t$
as if they were coefficients which contribute to defining the
problem of this integration.

Not only must the memory have sufficient room to store
these intermediate data but there must be provisions
whereby these data can later be removed, i.e. at the end
of the $(t + \Delta t)$ cycle, and replaced by the corresponding
data for the $(t + 2\Delta t)$ cycle. This process of removing data
from the memory and of replacing them with new information
must, of course, be done quite automatically under the
direction of the control.

b  For  total  differential  equations  the  memory  requirements 
are  clearly  similar  to,  but  smaller  than,  those  discussed  in 
(a)  above. 

c  Problems  that  are  solved  by  iterative  procedures  such  as 
systems  of  linear  equations  or  elliptic  partial  differential 
equations,  treated  by  relaxation  techniques,  may  be  ex- 
pected to  require  quite  extensive  memory  capacity.  The 
memory  requirement  for  such  problems  is  apparently  much 
greater  than  for  those  problems  in  (a)  above  in  which  one 
needs  only  to  store  information  corresponding  to  the  in- 
stantaneous value  of  one  variable  [t  in  (a)  above],  while  now 
entire  solutions  (covering  all  values  of  all  variables)  must 
be  stored.  This  apparent  discrepancy  in  magnitudes  can, 
however,  be  somewhat  overcome  by  the  use  of  techniques 
which  permit  the  use  of  much  coarser  integration  meshes 
in  this  case,  than  in  the  cases  under  (a). 
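
The contrast in storage between cases (a) and (c) can be sketched as follows. This is an illustration under stated assumptions, not a method taken from the memorandum: the explicit diffusion scheme, the mesh ratio `r`, the fixed boundary values, and all function names are hypothetical. The point is only that a marching computation keeps one time level and replaces it each cycle, whereas a relaxation computation must hold the entire solution.

```python
def march(initial, steps, r=0.25):
    """Advance a 1-D explicit diffusion scheme, keeping one time level."""
    current = list(initial)
    for _ in range(steps):
        nxt = current[:]                      # storage for the (t + dt) level
        for i in range(1, len(current) - 1):  # sweep from one boundary to the other
            nxt[i] = current[i] + r * (
                current[i - 1] - 2 * current[i] + current[i + 1]
            )
        current = nxt                         # data for time t are removed and replaced
    return current


def words_for_marching(nx):
    """Words held at once: the level at t plus the level at t + dt."""
    return 2 * nx


def words_for_relaxation(nx, ny):
    """Words held at once: the entire solution over both variables."""
    return nx * ny


print(march([0.0, 0.0, 4.0, 0.0, 0.0], 1))
print(words_for_marching(100), words_for_relaxation(100, 100))
```

A coarser mesh in the relaxation case, as the text suggests, reduces `nx` and `ny` and so narrows the gap.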

2.3. It is reasonable at this time to build a machine that can
conveniently handle problems several orders of magnitude more
complex than are now handled by existing machines, electronic
or electro-mechanical. We consequently plan on a fully automatic
electronic storage facility of about 4,000 numbers of 40 binary
digits each. This corresponds to a precision of $2^{-40} \approx 0.9 \times 10^{-12}$,
i.e. of about 12 decimals. We believe that this memory capacity
exceeds the capacities required for most problems that one deals
with at present by a factor of about 10. The precision is also safely
higher than what is required for the great majority of present day
problems. In addition, we propose that we have a subsidiary
memory of much larger capacity, which is also fully automatic,
on  some  medium  such  as  magnetic  wire  or  tape. 
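
The precision quoted for a 40-binary-digit number can be verified directly. The short check below is a sketch for the reader's convenience; the word length is taken from the text, and the variable names are ours.

```python
from fractions import Fraction
import math

WORD_BITS = 40  # binary digits per number, from the text

# Smallest step representable with 40 binary digits after the point.
resolution = Fraction(1, 2 ** WORD_BITS)

# Equivalent number of decimal digits carried by 40 binary digits.
decimal_digits = WORD_BITS * math.log10(2)

print(f"2^-40 = {float(resolution):.3e}")
print(f"equivalent decimal digits = {decimal_digits:.2f}")
```

Both results agree with the figures in the text: a resolution of roughly 0.9 × 10⁻¹², or about 12 decimals.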

3.    First  remarks  on  the  control  and  code 

3.1. It is easy to see by formal-logical methods that there exist
codes  that  are  in  abstracto  adequate  to  control  and  cause  the 
execution  of  any  sequence  of  operations  which  are  individually 
available  in  the  machine  and  which  are,  in  their  entirety,  con- 
ceivable by  the  problem  planner.  The  really  decisive  considera- 
tions from  the  present  point  of  view,  in  selecting  a  code,  are  more 
of a practical nature: simplicity of the equipment demanded by
the  code,  and  the  clarity  of  its  application  to  the  actually  impor- 
tant problems  together  with  the  speed  of  its  handling  of  those 
problems.  It  would  take  us  much  too  far  afield  to  discuss  these 
questions  at  all  generally  or  from  first  principles.  We  will  therefore 
restrict  ourselves  to  analyzing  only  the  type  of  code  which  we 
now  envisage  for  our  machine. 

3.2. There must certainly be instructions for performing the
fundamental  arithmetic  operations.  The  specifications  for  these 
orders  will  not  be  completely  given  until  the  arithmetic  unit  is 
described  in  a  little  more  detail. 

3.3.  It  must  be  possible  to  transfer  data  from  the  memory  to 
the  arithmetic  organ  and  back  again.  In  transferring  information 
from  the  arithmetic  organ  back  into  the  memory  there  are  two 
types  we  must  distinguish:  Transfers  of  numbers  as  such  and  trans- 
fers of  numbers  which  are  parts  of  orders.  The  first  case  is  quite 
obvious  and  needs  no  further  explication.  The  second  case  is  more 
subtle  and  serves  to  illustrate  the  generality  and  simplicity  of  the 
system.  Consider,  by  way  of  illustration,  the  problem  of  interpola- 
tion in  the  system.  Let  us  suppose  that  we  have  formulated  the 
necessary  instructions  for  performing  an  interpolation  of  order  n 
in  a  sequence  of  data.  The  exact  location  in  the  memory  of  the 
(n + 1) quantities that bracket the desired functional value is, of
course, a function of the argument. This argument probably is
found as the result of a computation in the machine. We thus need
an  order  which  can  substitute  a  number  into  a  given  order — in 
the  case  of  interpolation  the  location  of  the  argument  or  the  group 
of  arguments  that  is  nearest  in  our  table  to  the  desired  value.  By 
means  of  such  an  order  the  results  of  a  computation  can  be  in- 
troduced into  the  instructions  governing  that  or  a  different  com- 
putation. This  makes  it  possible  for  a  sequence  of  instructions  to 
be  used  with  different  sets  of  numbers  located  in  different  parts 
of  the  memory. 




To summarize, transfers into the memory will be of two sorts:
total substitutions, whereby the quantity previously stored is
cleared out and replaced by a new number; and partial substitutions,
in which that part of an order containing a memory location-number
(we assume the various positions in the memory are
enumerated serially by memory location-numbers) is replaced by
a  new  memory  location-number. 
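
The two kinds of transfer can be sketched in a few lines. This is a modern illustration rather than the coding scheme of the memorandum, which is only given later: the `Order` record, the operation name `"LOAD"`, the memory layout, and both function names are assumptions made for the example. It shows a computed result being substituted into the location-number part of an order, so that the same order reaches a different table entry.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Order:
    op: str        # operation part of the order
    address: int   # memory location-number part of the order


def total_substitution(memory, location, value):
    """Clear the word at `location` and store a new quantity there."""
    memory[location] = value


def partial_substitution(memory, location, new_address):
    """Replace only the location-number part of the order at `location`."""
    order = memory[location]
    if not isinstance(order, Order):
        raise TypeError("partial substitution applies to orders only")
    memory[location] = replace(order, address=new_address)


# Hypothetical layout: word 0 holds an order, words 10..13 hold a table.
memory = {0: Order("LOAD", 10), 10: 1.0, 11: 2.0, 12: 4.0, 13: 8.0}

computed_index = 2   # stands for the result of an earlier computation
partial_substitution(memory, 0, 10 + computed_index)
fetched = memory[memory[0].address]   # the same order now reaches a different number
print(memory[0], fetched)
```

Later machines with index registers obtain the same effect without altering the stored order.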

3.4.  It  is  clear  that  one  must  be  able  to  get  numbers  from 
any  part  of  the  memory  at  any  time.  The  treatment  in  the  case 
of  orders  can,  however,  be  more  methodical  since  one  can  at  least 
partially  arrange  the  control  instructions  in  a  linear  sequence. 
Consequently the control will be so constructed that it will normally
proceed from place n in the memory to place (n + 1) for
its  next  instruction. 

3.5.  The  utility  of  an  automatic  computer  lies  in  the  possi- 
bility of  using  a  given  sequence  of  instructions  repeatedly,  the 
number  of  times  it  is  iterated  being  either  preassigned  or  depend- 
ent upon  the  results  of  the  computation.  When  the  iteration  is 
completed  a  different  sequence  of  orders  is  to  be  followed,  so  we 
must,  in  most  cases,  give  two  parallel  trains  of  orders  preceded 
by  an  instruction  as  to  which  routine  is  to  be  followed.  This  choice 
can  be  made  to  depend  upon  the  sign  of  a  number  (zero  being 
reckoned  as  plus  for  machine  purposes).  Consequently,  we  intro- 
duce an  order  (the  conditional  transfer  order)  which  will,  depend- 
ing on  the  sign  of  a  given  number,  cause  the  proper  one  of  two 
routines  to  be  executed. 

Frequently  two  parallel  trains  of  orders  terminate  in  a  common 
routine.  It  is  desirable,  therefore,  to  order  the  control  in  either 
case  to  proceed  to  the  beginning  point  of  the  common  routine. 
This  unconditional  transfer  can  be  achieved  either  by  the  artificial 
use  of  a  conditional  transfer  or  by  the  introduction  of  an  explicit 
order  for  such  a  transfer. 
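
The sequencing rules of 3.4 and 3.5 can be sketched as a very small interpreter. The order names (`LOAD`, `ADD`, `SUB`, `STORE`, `CJUMP`, `JUMP`, `HALT`), the single accumulator, and the sample program are all hypothetical; they are chosen only to show normal progression from place n to place n + 1, a conditional transfer on the sign of a number with zero reckoned as plus, and an explicit unconditional transfer.

```python
def run(program, data, max_steps=10_000):
    """Interpret a tiny hypothetical one-address order code."""
    acc = 0   # accumulator
    pc = 0    # location of the current order
    for _ in range(max_steps):
        op, arg = program[pc]
        pc += 1                      # normally proceed from place n to n + 1
        if op == "LOAD":
            acc = data[arg]
        elif op == "ADD":
            acc += data[arg]
        elif op == "SUB":
            acc -= data[arg]
        elif op == "STORE":
            data[arg] = acc
        elif op == "CJUMP":          # conditional transfer on the sign
            if acc >= 0:             # zero is reckoned as plus
                pc = arg
        elif op == "JUMP":           # explicit unconditional transfer
            pc = arg
        elif op == "HALT":
            return data
        else:
            raise ValueError(f"unknown order {op!r}")
    raise RuntimeError("step limit exceeded")


# Add count, count - 1, ..., 1 into total by iterating one sequence of orders.
program = [
    ("LOAD", "total"), ("ADD", "count"), ("STORE", "total"),
    ("LOAD", "count"), ("SUB", "one"), ("STORE", "count"),
    ("SUB", "one"),      # accumulator is negative once count has reached zero
    ("CJUMP", 0),        # otherwise repeat the sequence
    ("HALT", None),
]
result = run(program, {"count": 3, "total": 0, "one": 1})
print(result["total"])
```

The number of iterations here depends on a computed quantity, which is the second of the two cases the text distinguishes.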

3.6.  Finally  we  need  orders  which  will  integrate  the  input- 
output  devices  with  the  machine.  These  are  discussed  briefly  in 
6.8. 

3.7.  We  proceed  now  to  a  more  detailed  discussion  of  the 
machine.  Inasmuch  as  our  experience  has  shown  that  the  moment 
one  chooses  a  given  component  as  the  elementary  memory  unit, 
one  has  also  more  or  less  determined  upon  much  of  the  balance 
of  the  machine,  we  start  by  a  consideration  of  the  memory  organ. 
In  attempting  an  exposition  of  a  highly  integrated  device  like  a 
computing  machine  we  do  not  find  it  possible,  however,  to  give 
an  exhaustive  discussion  of  each  organ  before  completing  its 
description.  It  is  only  in  the  final  block  diagrams  that  anything 
approaching  a  complete  unit  can  be  achieved. 

The  time  units  to  be  used  in  what  follows  will  be: 


1 μsec = 1 microsecond = $10^{-6}$ seconds
1 msec = 1 millisecond = $10^{-3}$ seconds

4.    The  memory  organ 

4.1.  Ideally  one  would  desire  an  indefinitely  large  memory  ca- 
pacity such  that  any  particular  aggregate  of  40  binary  digits,  or 
word  (cf.  2.3),  would  be  immediately  available — i.e.  in  a  time 
which  is  somewhat  or  considerably  shorter  than  the  operation  time 
of  a  fast  electronic  multiplier.  This  may  be  assumed  to  be  practical 
at the level of about 100 μsec. Hence the availability time for a
word in the memory should be 5 to 50 μsec. It is equally desirable
that  words  may  be  replaced  with  new  words  at  about  the  same 
rate.  It  does  not  seem  possible  physically  to  achieve  such  a  capac- 
ity. We  are  therefore  forced  to  recognize  the  possibility  of  con- 
structing a  hierarchy  of  memories,  each  of  which  has  greater 
capacity  than  the  preceding  but  which  is  less  quickly  accessible. 

The  most  common  forms  of  storage  in  electrical  circuits  are 
the  flip-flop  or  trigger  circuit,  the  gas  tube,  and  the  electro- 
mechanical relay.  To  achieve  a  memory  of  n  words  would,  of 
course,  require  about  40n  such  elements,  exclusive  of  the  switching 
elements.  We  saw  earlier  (cf.  2.2)  that  a  fast  memory  of  several 
thousand words is not at all unreasonable for an all-purpose instru-
ment. Hence, about $10^5$ flip-flops or analogous elements would be
required!  This  would,  of  course,  be  entirely  impractical. 

We  must  therefore  seek  out  some  more  fundamental  method 
of  storing  electrical  information  than  has  been  suggested  above. 
One  criterion  for  such  a  storage  medium  is  that  the  individual 
storage  organs,  which  accommodate  only  one  binary  digit  each, 
should  not  be  macroscopic  components,  but  rather  microscopic 
elements  of  some  suitable  organ.  They  would  then,  of  course,  not 
be  identified  and  switched  to  by  the  usual  macroscopic  wire  con- 
nections, but  by  some  functional  procedure  in  manipulating  that 
organ. 

One  device  which  displays  this  property  to  a  marked  degree 
is  the  iconoscope  tube.  In  its  conventional  form  it  possesses  a  linear 
resolution  of  about  one  part  in  500.  This  would  correspond  to  a 
(two-dimensional) memory capacity of $500 \times 500 = 2.5 \times 10^5$.
One  is  accordingly  led  to  consider  the  possibility  of  storing  elec- 
trical charges  on  a  dielectric  plate  inside  a  cathode-ray  tube. 
Effectively  such  a  tube  is  nothing  more  than  a  myriad  of  electrical 
capacitors  which  can  be  connected  into  the  circuit  by  means  of 
an  electron  beam. 

Actually  the  above  mentioned  high  resolution  and  concomitant 
memory capacity are only realistic under the conditions of tele-
vision-image storage, which are much less exigent in respect to
the reliability of individual markings than what one can accept
in  the  storage  for  a  computer.  In  this  latter  case  resolutions  of 
one part in 20 to 100, i.e. memory capacities of 400 to 10,000,
would  seem  to  be  more  reasonable  in  terms  of  equipment  built 
essentially  along  familiar  lines. 

At the present time the Princeton Laboratories of the Radio
Corporation  of  America  are  engaged  in  the  development  of  a 
storage  tube,  the  Selectron,  of  the  type  we  have  mentioned  above. 
This  tube  is  also  planned  to  have  a  non-amplitude-sensitive  switch- 
ing system  whereby  the  electron  beam  can  be  directed  to  a  given 
spot  on  the  plate  within  a  quite  small  fraction  of  a  millisecond. 
Inasmuch  as  the  storage  tube  is  the  key  component  of  the  machine 
envisaged  in  this  report  we  are  extremely  fortunate  in  having 
secured  the  cooperation  of  the  RCA  group  in  this  as  well  as  in 
various  other  developments. 

An  alternate  form  of  rapid  memory  organ  is  the  acoustic  feed- 
back delay  line  described  in  various  reports  on  the  EDVAC.  (This 
is  an  electronic  computing  machine  being  developed  for  the 
Ordnance Department, U.S. Army, by the University of Pennsyl-
vania, Moore School of Electrical Engineering.) Inasmuch as that
device  has  been  so  clearly  reported  in  those  papers  we  give  no 
further  discussion.  There  are  still  other  physical  and  chemical 
properties  of  matter  in  the  presence  of  electrons  or  photons  that 
might  be  considered,  but  since  none  is  yet  beyond  the  early  dis- 
cussion stage  we  shall  not  make  further  mention  of  them. 

4.2.  We  shall  accordingly  assume  throughout  the  balance  of 
this  report  that  the  Selectron  is  the  modus  for  storage  of  words 
at electronic speeds. As now planned, this tube will have a capac-
ity of $2^{12} = 4{,}096 \approx 4{,}000$ binary digits. To achieve a total elec-
tronic storage of about 4,000 words we propose to use 40 Selec-
trons, thereby achieving a memory of $2^{12}$ words of 40 binary digits
each. (Cf. again 2.3.)
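
The capacity figures above, and the flip-flop estimate of 4.1, are simple products. The following sketch restates them; the constant and function names are illustrative assumptions, not terms from the report.

```python
BITS_PER_SELECTRON = 2 ** 12   # planned capacity of one tube: 4,096 binary digits
WORD_LENGTH = 40               # binary digits per word (cf. 2.3)
NUM_SELECTRONS = 40            # number of tubes proposed

# Each tube contributes one digit to every word, so the word count equals
# the capacity of a single tube.
words = BITS_PER_SELECTRON
total_bits = NUM_SELECTRONS * BITS_PER_SELECTRON


def storage_elements_needed(n_words, word_length=WORD_LENGTH):
    """Elements needed if every binary digit had its own flip-flop (the 40n of 4.1)."""
    return n_words * word_length
```

With these figures the memory holds 4,096 words, and a flip-flop memory of the same size would need 163,840 elements, which is the order of $10^5$ quoted in 4.1.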

4.3. There are two possible means for storing a particular
word in the Selectron memory (or, in fact, in either a delay line
memory or in a storage tube with amplitude-sensitive deflection).
One method is to store the entire word in a given tube and then
to get the word out by picking out its respective digits in a serial
fashion. The other method is to store in corresponding places in
each of the 40 tubes one digit of the word. To get a word from
the memory in this scheme requires, then, one switching mech-
anism to which all 40 tubes are connected in parallel. Such a
switching scheme seems to us to be simpler than the technique
needed in the serial system and is, of course, 40 times faster. We
accordingly adopt the parallel procedure and thus are led to con-
sider a so-called parallel machine, as contrasted with the serial
principles being considered for the EDVAC. (In the EDVAC the
peculiar characteristics of the acoustic delay line, as well as various
other  considerations,  seem  to  justify  a  serial  procedure.  For  more 
details,  cf.  the  reports  referred  to  in  4.1.)  The  essential  difference 
between  these  two  systems  lies  in  the  method  of  performing  an 
addition;  in  a  parallel  machine  all  corresponding  pairs  of  digits 
are  added  simultaneously,  whereas  in  a  serial  one  these  pairs  are 
added  serially  in  time. 
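
The parallel storage scheme can be pictured as 40 bit-planes addressed by one common switch. The sketch below is only a software analogy under stated assumptions: the class name `ParallelMemory` and its methods are invented for illustration, and the Python loops run one tube at a time, whereas in the machine all 40 tubes respond simultaneously.

```python
WORD_LENGTH = 40       # one tube per digit position
TUBE_CAPACITY = 4096   # binary digits per tube


class ParallelMemory:
    """40 bit-planes; tube k holds digit k (counted from the left) of every word."""

    def __init__(self):
        self.tubes = [[0] * TUBE_CAPACITY for _ in range(WORD_LENGTH)]

    def write(self, address, word):
        # A single address selects the same spot in all 40 tubes.
        for k in range(WORD_LENGTH):
            self.tubes[k][address] = (word >> (WORD_LENGTH - 1 - k)) & 1

    def read(self, address):
        # Reassemble the word from the digit delivered by each tube.
        word = 0
        for k in range(WORD_LENGTH):
            word = (word << 1) | self.tubes[k][address]
        return word
```

A serial arrangement would instead keep all 40 digits of a word in one tube and deliver them one after another, which is the source of the factor of 40 in speed mentioned above.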

4.4.  To  summarize,  we  assume  that  the  fast  electronic  memory 
consists  of  40  Selectrons  which  are  switched  in  parallel  by  a  com- 
mon switching  arrangement.  The  inputs  of  the  switch  are  con- 
trolled by  the  control. 

4.5.  Inasmuch  as  a  great  many  highly  important  classes  of 
problems require a far greater total memory than $2^{12}$ words, we
now consider the next stage in our storage hierarchy. Although
the solution of partial differential equations frequently involves
the manipulation of many thousands of words, these data are
generally required only in blocks which are well within the $2^{12}$
capacity of the electronic memory. Our second form of storage
must  therefore  be  a  medium  which  feeds  these  blocks  of  words 
to  the  electronic  memory.  It  should  be  controlled  by  the  control 
of  the  computer  and  is  thus  an  integral  part  of  the  system,  not 
requiring  human  intervention. 

There  are  evidently  two  distinct  problems  raised  above.  One 
can choose a given medium for storage such as teletype tapes,
magnetic  wire  or  tapes,  movie  film  or  similar  media.  There  still 
remains  the  problem  of  automatic  integration  of  this  storage 
medium  with  the  machine.  This  integration  is  achieved  logically 
by  introducing  appropriate  orders  into  the  code  which  can  instruct 
the  machine  to  read  or  write  on  the  medium,  or  to  move  it  by 
a  given  amount  or  to  a  place  with  given  characteristics.  We  discuss 
this  question  a  little  more  fully  in  6.8. 

Let  us  return  now  to  the  question  of  what  properties  the  sec- 
ondary storage  medium  should  have.  It  clearly  should  be  able  to 
store  information  for  periods  of  time  long  enough  so  that  only  a 
few  per  cent  of  the  total  computing  time  is  spent  in  re-registering 
information that is "fading off." It is certainly desirable, although
not  imperative,  that  information  can  be  erased  and  replaced  by 
new  data.  The  medium  should  be  such  that  it  can  be  controlled, 
i.e.  moved  forward  and  backward,  automatically.  This  considera- 
tion makes  certain  media,  such  as  punched  cards,  undesirable. 
While cards can, of course, be printed or read by appropriate
orders  from  some  machine,  they  are  not  well  adapted  to  problems 
in  which  the  output  data  are  fed  directly  back  into  the  machine, 
and  are  required  in  a  sequence  which  is  non-monotone  with  re- 
spect to  the  order  of  the  cards.  The  medium  should  be  capable 
of remembering very large numbers of data at a much smaller price
than electronic devices. It must be fast enough so that, even when
it  has  to  be  used  frequently  in  a  problem,  a  large  percentage  of 
the  total  solution  time  is  not  spent  in  getting  data  into  and  out 
of  this  medium  and  achieving  the  desired  positioning  on  it.  If  this 
condition  is  not  reasonably  well  met,  the  advantages  of  the  high 
electronic  speeds  of  the  machine  will  be  largely  lost. 

Both light- or electron-sensitive film and magnetic wires or
tapes,  whose  motions  are  controlled  by  servo-mechanisms  inte- 
grated with  the  control,  would  seem  to  fulfil  our  needs  reasonably 
well.  We  have  tentatively  decided  to  use  magnetic  wires  since  we 
have  achieved  reliable  performance  with  them  at  pulse  rates  of 
the  order  of  25,000/sec  and  beyond. 

4.6.  Lastly  our  memory  hierarchy  requires  a  vast  quantity  of 
dead  storage,  i.e.  storage  not  integrated  with  the  machine.  This 
storage  requirement  may  be  satisfied  by  a  library  of  wires  that 
can  be  introduced  into  the  machine  when  desired  and  at  that  time 
become  automatically  controlled.  Thus  our  dead  storage  is  really 
nothing  but  an  extension  of  our  secondary  storage  medium.  It 
differs  from  the  latter  only  in  its  availability  to  the  machine. 

4.7.  We  impose  one  additional  requirement  on  our  secondary 
memory.  It  must  be  possible  for  a  human  to  put  words  on  to  the 
wire  or  other  substance  used  and  to  read  the  words  put  on  by 
the  machine.  In  this  manner  the  human  can  control  the  machine's 
functions.  It  is  now  clear  that  the  secondary  storage  medium  is 
really nothing other than a part of our input-output system; cf.
6.8.4  for  a  description  of  a  mechanism  for  achieving  this. 

4.8.  There  is  another  highly  important  part  of  the  input- 
output  which  we  merely  mention  at  this  time,  namely,  some 
mechanism  for  viewing  graphically  the  results  of  a  given  compu- 
tation. This  can,  of  course,  be  achieved  by  a  Selectron-like  tube 
which  causes  its  screen  to  fluoresce  when  data  are  put  on  it  by 
an  electron  beam. 

4.9.  For  definiteness  in  the  subsequent  discussions  we  assume 
that  associated  with  the  output  of  each  Selectron  is  a  flip-flop. 
This  assemblage  of  40  flip-flops  we  term  the  Selectron  Register. 

5.    The  arithmetic  organ 

5.1.  In  this  section  we  discuss  the  features  we  now  consider 
desirable  for  the  arithmetic  part  of  our  machine.  We  give  our 
tentative  conclusions  as  to  which  of  the  arithmetic  operations 
should  be  built  into  the  machine  and  which  should  be  pro- 
grammed. Finally,  a  schematic  of  the  arithmetic  unit  is  described. 

5.2.  In  a  discussion  of  the  arithmetical  organs  of  a  computing 
machine  one  is  naturally  led  to  a  consideration  of  the  number 
system to be adopted. In spite of the longstanding tradition of
building digital machines in the decimal system, we feel strongly
in favor of the binary system for our device. Our fundamental unit
of  memory  is  naturally  adapted  to  the  binary  system  since  we  do 
not  attempt  to  measure  gradations  of  charge  at  a  particular  point 
in  the  Selectron  but  are  content  to  distinguish  two  states.  The 
flip-flop  again  is  truly  a  binary  device.  On  magnetic  wires  or  tapes 
and  in  acoustic  delay  line  memories  one  is  also  content  to  recog- 
nize the  presence  or  absence  of  a  pulse  or  (if  a  carrier  frequency 
is  used)  of  a  pulse  train,  or  of  the  sign  of  a  pulse.  (We  will  not 
discuss  here  the  ternary  possibilities  of  a  positive-or-negative- 
or-no-pulse  system  and  their  relationship  to  questions  of  reliability 
and  checking,  nor  the  very  interesting  possibilities  of  carrier  fre- 
quency modulation.)  Hence  if  one  contemplates  using  a  decimal 
system  with  either  the  iconoscope  or  delay-line  memory  one  is 
forced  into  a  binary  coding  of  the  decimal  system — each  decimal 
digit  being  represented  by  at  least  a  tetrad  of  binary  digits.  Thus 
an  accuracy  of  ten  decimal  digits  requires  at  least  40  binary  digits. 
In  a  true  binary  representation  of  numbers,  however,  about  33 
digits suffice to achieve a precision of $10^{-10}$. The use of the binary
system  is  therefore  somewhat  more  economical  of  equipment  than 
is  the  decimal. 
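
The digit counts in this comparison can be checked with a short calculation. This is a minimal sketch; the function names are assumptions introduced here. Note that $10 \log_2 10 \approx 33.2$, which is the "about 33" of the text, while an exact count of distinguishable values rounds up to 34.

```python
import math


def bcd_bits(decimal_digits):
    """Binary digits needed when each decimal digit is coded as a tetrad."""
    return 4 * decimal_digits


def binary_bits(decimal_digits):
    """Smallest number of binary digits that can distinguish 10**d values."""
    return math.ceil(decimal_digits * math.log2(10))
```

For ten decimal digits the binary-coded decimal form needs 40 binary digits, against 34 for a true binary representation.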

The  main  virtue  of  the  binary  system  as  against  the  decimal 
is,  however,  the  greater  simplicity  and  speed  with  which  the 
elementary  operations  can  be  performed.  To  illustrate,  consider 
multiplication  by  repeated  addition.  In  binary  multiplication  the 
product  of  a  particular  digit  of  the  multiplier  by  the  multiplicand 
is  either  the  multiplicand  or  null  according  as  the  multiplier  digit 
is  1  or  0.  In  the  decimal  system,  however,  this  product  has  ten 
possible  values  between  null  and  nine  times  the  multiplicand, 
inclusive. Of course, a decimal number has only $\log_{10} 2 \approx 0.3$ times
as  many  digits  as  a  binary  number  of  the  same  accuracy,  but  even 
so  multiplication  in  the  decimal  system  is  considerably  longer  than 
in  the  binary  system.  One  can  accelerate  decimal  multiplication 
by  complicating  the  circuits,  but  this  fact  is  irrelevant  to  the  point 
just  made  since  binary  multiplication  can  likewise  be  accelerated 
by  adding  to  the  equipment.  Similar  remarks  may  be  made  about 
the  other  operations. 
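
The contrast between the two systems can be sketched in a few lines. The code below is an illustration under assumptions (non-negative integer operands, hypothetical function names) and not a description of the multiplier circuits: in binary each multiplier digit selects either the multiplicand or nothing, while a decimal digit must select one of ten multiples.

```python
def binary_multiply(multiplicand, multiplier, n_bits=40):
    """Multiply non-negative integers by inspecting the multiplier one binary digit at a time."""
    product = 0
    for i in range(n_bits):
        if (multiplier >> i) & 1:
            product += multiplicand << i   # digit 1: add the shifted multiplicand
        # digit 0: add nothing
    return product


def decimal_partial_products(multiplicand):
    """The ten candidate partial products that a decimal multiplier digit may select."""
    return [d * multiplicand for d in range(10)]
```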

An  additional  point  that  deserves  emphasis  is  this:  An  important 
part  of  the  machine  is  not  arithmetical,  but  logical  in  nature.  Now 
logics,  being  a  yes-no  system,  is  fundamentally  binary.  Therefore 
a  binary  arrangement  of  the  arithmetical  organs  contributes  very 
significantly  towards  producing  a  more  homogeneous  machine, 
which  can  be  better  integrated  and  is  more  efficient. 

The  one  disadvantage  of  the  binary  system  from  the  human 
point  of  view  is  the  conversion  problem.  Since,  however,  it  is 
completely known how to convert numbers from one base to
another and since this conversion can be effected solely by the
use of the usual arithmetic processes there is no reason why the
computer itself cannot carry out this conversion. It might be
argued that this is a time consuming operation. This, however,
is not the case. (Cf. 9.6 and 9.7 of Part II. Part II is a report issued
under the title Planning and Coding of Problems for an Electronic
Computing Instrument.¹) Indeed a general-purpose computer, used
as  a  scientific  research  tool,  is  called  upon  to  do  a  very  great 
number  of  multiplications  upon  a  relatively  small  amount  of  input 
data,  and  hence  the  time  consumed  in  the  decimal  to  binary 
conversion  is  only  a  trivial  percentage  of  the  total  computing  time. 
A  similar  remark  is  applicable  to  the  output  data. 

In  the  preceding  discussion  we  have  tacitly  assumed  the  de- 
sirability of  introducing  and  withdrawing  data  in  the  decimal 
system.  We  feel,  however,  that  the  base  10  may  not  even  be  a 
permanent feature in a scientific instrument and consequently will
probably attempt to train ourselves to use numbers base 2 or 8
or  16.  The  reason  for  the  bases  8  or  16  is  this:  Since  8  and  16 
are  powers  of  2  the  conversion  to  binary  is  trivial;  since  both  are 
about  the  size  of  10,  they  violate  many  of  our  habits  less  badly 
than  base  2.  (Cf.  Part  II,  9.4.) 
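
Why conversion between binary and base 8 or 16 is trivial can be shown by grouping digits. This is a minimal sketch; the function name and the string representation of numbers are assumptions made for illustration.

```python
def to_base_power_of_two(bits, group):
    """Convert a binary digit string to base 2**group by grouping digits from the right.

    group=3 gives base 8 and group=4 gives base 16.
    """
    digits = "0123456789ABCDEF"
    width = -(-len(bits) // group) * group   # round the length up to a multiple of group
    padded = bits.zfill(width)               # pad with zeros on the left
    return "".join(
        digits[int(padded[i:i + group], 2)] for i in range(0, width, group)
    )
```

For example, the binary form of decimal 123 is 1111011, which groups into 173 in base 8 and 7B in base 16 without any arithmetic beyond reading off each group.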

5.3.  Several  of  the  digital  computers  being  built  or  planned 
in  this  country  and  England  are  to  contain  a  so-called  "floating 
decimal  point".  This  is  a  mechanism  for  expressing  each  word  as 
a  characteristic  and  a  mantissa — e.g.  123.45  would  be  carried  in 
the  machine  as  (0.12345,03),  where  the  3  is  the  exponent  of  10 
associated  with  the  number.  There  appear  to  be  two  major  pur- 
poses in  a  "floating"  decimal  point  system  both  of  which  arise  from 
the  fact  that  the  number  of  digits  in  a  word  is  a  constant,  fixed 
by  design  considerations  for  each  particular  machine.  The  first  of 
these  purposes  is  to  retain  in  a  sum  or  product  as  many  significant 
digits  as  possible  and  the  second  of  these  is  to  free  the  human 
operator  from  the  burden  of  estimating  and  inserting  into  a  prob- 
lem "scale  factors" — multiplicative  constants  which  serve  to  keep 
numbers  within  the  limits  of  the  machine. 

There  is,  of  course,  no  denying  the  fact  that  human  time  is 
consumed  in  arranging  for  the  introduction  of  suitable  scale  fac- 
tors. We  only  argue  that  the  time  so  consumed  is  a  very  small 
percentage  of  the  total  time  we  will  spend  in  preparing  an  inter- 
esting problem  for  our  machine.  The  first  advantage  of  the  floating 
point  is,  we  feel,  somewhat  illusory.  In  order  to  have  such  a  floating 
point  one  must  waste  memory  capacity  which  could  otherwise  be 
used for carrying more digits per word. It would therefore seem
to us not at all clear whether the modest advantages of a floating
binary point offset the loss of memory capacity and the increased
complexity of the arithmetic and control circuits.

¹See Bibliography [Goldstine and von Neumann, 1963b, 1963c, 1963d].
References in this chapter are all to this report.

There are certainly some problems within the scope of our
device which really require more than $2^{-40}$ precision. To handle
such problems we wish to plan in terms of words whose lengths
are some fixed integral multiple of 40, and program the machine
in such a manner as to give the corresponding aggregates of 40
digit words the proper treatment. We must then consider an addi-
tion or multiplication as a complex operation programmed from
a number of primitive additions or multiplications (cf. §9, Part
II). There would seem to be considerable extra difficulties in the
way of such a procedure in an instrument with a floating binary
point.
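
A programmed addition built from primitive 40-digit additions might look like the sketch below. It is an illustration under loudly stated assumptions: all 40 digits of each word are treated as magnitude digits (the sign convention of 5.7 is ignored), words are listed least significant first, and the function name is invented here.

```python
WORD_BITS = 40
WORD_MASK = (1 << WORD_BITS) - 1


def add_multiword(a_words, b_words):
    """Add two numbers given as equal-length lists of 40-bit words, least significant first.

    Returns (sum_words, carry_out).
    """
    if len(a_words) != len(b_words):
        raise ValueError("operands must have the same number of words")
    result = []
    carry = 0
    for a, b in zip(a_words, b_words):
        total = a + b + carry            # one primitive 40-digit addition plus carry-in
        result.append(total & WORD_MASK)
        carry = total >> WORD_BITS       # carry into the next, more significant word
    return result, carry
```

The only coupling between the primitive additions is the single carry passed from one word to the next, which is what makes the fixed-point multiple-length scheme straightforward to program.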

The reader may remark upon our alternate spells of radicalism
and conservatism in deciding upon various possible features for
our mechanism. We hope, however, that he will agree, on closer
inspection, that we are guided by a consistent and sound principle
in judging the merits of any idea. We wish to incorporate into
the machine, in the form of circuits, only such logical concepts
as are either necessary to have a complete system or highly con-
venient because of the frequency with which they occur and the
influence they exert in the relevant mathematical situations.

5.4.  On  the  basis  of  this  criterion  we  definitely  wish  to  build 
into  the  machine  circuits  which  will  enable  it  to  form  the  binary 
sum  of  two  40  digit  numbers.  We  make  this  decision  not  because 
addition is a logically basic notion but rather because it would
slow the mechanism as well as the operator down enormously if
each addition were programmed out of the more simple operations
of "and", "or", and "not". The same is true for the subtraction.
Similarly we reject the desire to form products by programming
them  out  of  additions,  the  detailed  motivation  being  very  much 
the  same  as  in  the  case  of  addition  and  subtraction.  The  cases  for 
division  and  square-rooting  are  much  less  clear. 

It is well known that the reciprocal of a number $a$ can be
formed to any desired accuracy by iterative schemes. One such
scheme consists of improving an estimate $X$ by forming
$X' = 2X - aX^2$. Thus the new error $1 - aX'$ is $(1 - aX)^2$, which is the
square of the error in the preceding estimate. We notice that in
the formation of $X'$, there are two bona fide multiplications; we
do not consider multiplication by 2 as a true product since we
will have a facility for shifting right or left in one or two pulse
times. If then we somehow could guess $1/a$ to a precision of $2^{-5}$,
6 multiplications (3 iterations) would suffice to give a final result
good to $2^{-40}$. Accordingly a small table of $2^4$ entries could be used
to get the initial estimate of $1/a$. In this way a reciprocal $1/a$
could be formed in 6 multiplication times, and hence a quotient
$b/a$ in 7 multiplication times. Accordingly we see that the question
of building a divider is really a function of how fast it can be made
to  operate  compared  to  the  iterative  method  sketched  above:  In 
order  to  justify  its  existence,  a  divider  must  perform  a  division  in 
a  good  deal  less  than  7  multiplication  times.  We  have,  however, 
conceived  a  divider  which  is  much  faster  than  these  7  multipli- 
cation times  and  therefore  feel  justified  in  building  it,  especially 
since  the  amount  of  equipment  needed  above  the  requirements 
of  the  multiplier  is  not  important. 
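
The error-squaring behaviour of the iterative reciprocal scheme can be checked with exact rational arithmetic. This is a minimal sketch: the function name, the use of Python's `fractions` module, and the sample values are assumptions chosen for illustration, and the sketch is not the divider the authors go on to build.

```python
from fractions import Fraction


def reciprocal(a, x0, iterations=3):
    """Refine an estimate x0 of 1/a with X' = 2X - a*X**2.

    Returns (final estimate, list of errors |1 - a*X| before and after each step).
    """
    a = Fraction(a)
    x = Fraction(x0)
    errors = [abs(1 - a * x)]
    for _ in range(iterations):
        x = 2 * x - a * x * x          # two true multiplications; doubling is a shift
        errors.append(abs(1 - a * x))
    return x, errors
```

For instance, with $a = 3/4$ and the starting guess $21/16$, the initial error is $2^{-6}$, and three iterations (six multiplications) bring it to $2^{-48}$, comfortably below $2^{-40}$.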

It  is,  of  course,  also  possible  to  handle  square  roots  by  iterative 
techniques. In fact, if $X$ is our estimate of $a^{1/2}$, then
$X' = \tfrac{1}{2}(X + a/X)$ is a better estimate. We see that this scheme involves
one  division  per  iteration.  As  will  be  seen  below  in  our  more  detailed 
examination  of  the  arithmetic  organ  we  do  not  include  a  square- 
rooter  in  our  plans  because  such  a  device  would  involve  more 
equipment  than  we  feel  is  desirable  in  a  first  model.  (Concerning  the 
iterative  method  of  square-rooting,  cf.  8.10  in  Part  II.) 
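
The square-root iteration can be sketched in the same way. The function name and the use of exact fractions are assumptions made for illustration; the point is only that each step costs one division, the halving being a shift.

```python
from fractions import Fraction


def sqrt_iter(a, x0, iterations=5):
    """Refine an estimate x0 of the square root of a with X' = (X + a/X) / 2."""
    a = Fraction(a)
    x = Fraction(x0)
    for _ in range(iterations):
        x = (x + a / x) / 2            # one division per iteration
    return x
```

Starting from the crude estimate 1 for the square root of 2, the successive estimates are 3/2, 17/12, 577/408 and so on.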

5.5.  The  first  part  of  our  arithmetic  organ  requires  little  dis- 
cussion at  this  point.  It  should  be  a  parallel  storage  organ  which 
can receive a number and add it to the one already in it, which
is  also  able  to  clear  its  contents  and  which  can  transmit  what  it 
contains.  We  will  call  such  an  organ  an  Accumulator.  It  is  quite 
conventional  in  principle  in  past  and  present  computing  machines 
of  the  most  varied  types,  e.g.  desk  multipliers,  standard  IBM 
counters, more modern relay machines, the ENIAC. There are, of
course, numerous ways to build such a binary accumulator. We
distinguish  two  broad  types  of  such  devices:  static,  and  dynamic 
or  pulse-type  accumulators.  These  will  be  discussed  in  5.11,  but 
it  is  first  necessary  to  make  a  few  remarks  concerning  the  arith- 
metic of  binary  addition.  In  a  parallel  accumulator,  the  first  step 
in  an  addition  is  to  add  each  digit  of  the  addend  to  the  corre- 
sponding digit  of  the  augend.  The  second  step  is  to  perform  the 
carries,  and  this  must  be  done  in  sequence  since  a  carry  may 
produce a carry. In the worst case, 39 carries will occur. Clearly
it  is  inefficient  to  allow  39  times  as  much  time  for  the  second 
step  (performing  the  carries)  as  for  the  first  step  (adding  the  digits). 
Hence  either  the  carries  must  be  accelerated,  or  use  must  be  made 
of  the  average  number  of  carries  or  both. 

5.6. We shall show that for a sum of binary words, each of
length $n$, the length of the largest carry sequence is on the average
not in excess of $\log_2 n$. Let $p_n(v)$ designate the probability that
a carry sequence is of length $v$ or greater in the sum of two binary
words of length $n$. Then clearly $p_n(v) - p_n(v+1)$ is the proba-
bility that the largest carry sequence is of length exactly $v$, and
the weighted average

$$a_n = \sum_{v=1}^{n} v \, [p_n(v) - p_n(v+1)]$$

is the average length of such carry. Note that

$$\sum_{v=0}^{n} [p_n(v) - p_n(v+1)] = 1$$

since $p_n(v) = 0$ if $v > n$. From these it is easily inferred that

$$a_n = \sum_{v=1}^{n} p_n(v)$$

We now proceed to show that $p_n(v) \le \min[1, (n - v + 1)/2^{v+1}]$.
Observe first that

$$p_n(v) = p_{n-1}(v) + \frac{1 - p_{n-v}(v)}{2^{v+1}} \quad \text{if } v \le n$$

Indeed, $p_n(v)$ is the probability that the sum of two $n$-digit numbers
contains a carry sequence of length $\ge v$. This probability obtains
by adding the probabilities of two mutually exclusive alternatives:
First: Either the $n - 1$ first digits of the two numbers by them-
selves contain a carry sequence of length $\ge v$. This has the proba-
bility $p_{n-1}(v)$. Second: The $n - 1$ first digits of the two numbers
by themselves do not contain a carry sequence of length $\ge v$. In
this case any carry sequence of length $\ge v$ in the total numbers
(of length $n$) must end with the last digits of the total sequence.
Hence these must form the combination 1, 1. The next $v - 1$ digits
must propagate the carry, hence each of these must form the
combination 1, 0 or 0, 1. (The combinations 1, 1 and 0, 0 do not
propagate a carry.) The probability of the combination 1, 1 is $\tfrac{1}{4}$,
that one of the alternative combinations 1, 0 or 0, 1 is $\tfrac{1}{2}$. The
total probability of this sequence is therefore $\tfrac{1}{4}(\tfrac{1}{2})^{v-1} = (\tfrac{1}{2})^{v+1}$.
The remaining $n - v$ digits must not contain a carry sequence
of length $\ge v$. This has the probability $1 - p_{n-v}(v)$. Thus the
probability of the second case is $[1 - p_{n-v}(v)]/2^{v+1}$. Combining
these two cases, the desired relation

$$p_n(v) = p_{n-1}(v) + \frac{1 - p_{n-v}(v)}{2^{v+1}}$$

obtains. The observation that $p_n(v) = 0$ if $v > n$ is trivial.

We see with the help of the formulas proved above that
$p_n(v) - p_{n-1}(v)$ is always $\le 1/2^{v+1}$ and hence that the sum

$$\sum_{i=v}^{n} [p_i(v) - p_{i-1}(v)] = p_n(v)$$

is not in excess of $(n - v + 1)/2^{v+1}$ since there are $n - v + 1$
terms in the sum; since, moreover, each $p_n(v)$ is a probability, it
is not greater than 1. Hence we have

$$p_n(v) \le \min\left[1, \frac{n - v + 1}{2^{v+1}}\right]$$

Finally we turn to the question of getting an upper bound on
$a_n = \sum_{v=1}^{n} p_n(v)$. Choose $K$ so that $2^K \le n \le 2^{K+1}$. Then

$$a_n = \sum_{v=1}^{K-1} p_n(v) + \sum_{v=K}^{n} p_n(v) \le \sum_{v=1}^{K-1} 1 + \sum_{v=K}^{n} \frac{n}{2^{v+1}} \le (K - 1) + \frac{n}{2^K}$$

This last expression is clearly linear in $n$ in the interval
$2^K \le n \le 2^{K+1}$, and it is $= K$ for $n = 2^K$ and $= K + 1$ for
$n = 2^{K+1}$, i.e. it is $= \log_2 n$ at both ends of this interval. Since
the function $\log_2 n$ is everywhere concave from below, it follows
that our expression is $\le \log_2 n$ throughout this interval. Thus
$a_n \le \log_2 n$. This holds for all $K$, i.e. for all $n$, and it is the in-
equality which we wanted to prove.

For our case $n = 40$ we have $a_n \le \log_2 40 \approx 5.3$, i.e. an average
length of about 5 for the longest carry sequence. (The actual value
of $a_{40}$ is 4.62.)
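
The recurrence for $p_n(v)$ lends itself to direct evaluation, which gives a way to check the bound numerically. The sketch below is an illustration only: the function names are assumptions, exact fractions are used to avoid rounding, and the base case $p_n(v) = 0$ for $v > n$ is the one stated in the text.

```python
from fractions import Fraction
from functools import lru_cache
import math


@lru_cache(maxsize=None)
def p(n, v):
    """Probability of a carry sequence of length >= v in the sum of two n-digit words."""
    if v > n:
        return Fraction(0)
    # p_n(v) = p_{n-1}(v) + [1 - p_{n-v}(v)] / 2**(v+1)
    return p(n - 1, v) + (1 - p(n - v, v)) / Fraction(2 ** (v + 1))


def average_longest_carry(n):
    """a_n, computed as the sum of p_n(v) for v = 1, ..., n."""
    return sum(p(n, v) for v in range(1, n + 1))


def upper_bound(n, v):
    """The bound min[1, (n - v + 1) / 2**(v+1)] on p_n(v)."""
    return min(Fraction(1), Fraction(n - v + 1, 2 ** (v + 1)))
```

Evaluating `float(average_longest_carry(40))` lets a reader compare the result with the value of $a_{40}$ quoted above and with the bound $\log_2 40$.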

5.7.  Having  discussed  the  addition,  we  can  now  go  on  to  the 
subtraction.  It  is  convenient  to  discuss  at  this  point  our  treatment 
of  negative  numbers,  and  in  order  to  do  that  right,  it  is  desirable 
to  make  some  observations  about  the  treatment  of  numbers  in 
general. 

Our numbers are 40 digit aggregates, the left-most digit being
the sign digit, and the other digits genuine binary digits, with
positional values $2^{-1}, 2^{-2}, \ldots, 2^{-39}$ (going from left to right). Our
accumulator will, however, treat the sign digit, too, as a binary
digit with the positional value $2^0$, at least when it functions as
an adder. For numbers between 0 and 1 this is clearly all right:
The left-most digit will then be 0, and if 0 at this place is taken
to represent a + sign, then the number is correctly expressed with
its sign and 39 binary digits.

Let  us  now  consider  one  or  more  unrestricted  40  binary  digit 
numbers.  The  accumulator  will  add  them,  with  the  digit-adding 
and  the  carrying  mechanisms  functioning  normally  and  identically 
in  all  40  positions.  There  is  one  reservation,  however:  If  a  carry 
originates  in  the  left-most  position,  then  it  has  nowhere  to  go  from 
there  (there  being  no  further  positions  to  the  left)  and  is  "lost". 
This  means,  of  course,  that  the  addend  and  the  augend,  both 
numbers between 0 and 2, produced a sum exceeding 2, and the
accumulator, being unable to express a digit with a positional
value $2^1$, which would now be necessary, omitted 2. That is, the
sum was formed correctly, excepting a possible error 2. If several
such  additions  are  performed  in  succession,  then  the  ultimate  error 
may  be  any  integer  multiple  of  2.  That  is,  the  accumulator  is  an 
adder  which  allows  errors  that  are  integer  multiples  of  2 — it  is 
an  adder  modulo  2. 

It should be noted that our convention of placing the binary
point  immediately  to  the  right  of  the  left-most  digit  has  nothing 
to  do  with  the  structure  of  the  adder.  In  order  to  make  this  point 
clearer  we  proceed  to  discuss  the  possibilities  of  positioning  the 
binary  point  in  somewhat  more  detail. 

We begin by enumerating the 40 digits of our numbers (words)
from left to right. In doing this we use an index $h = 1, \ldots, 40$.
Now we might have placed the binary point just as well between
digits $j$ and $j + 1$, $j = 0, \ldots, 40$. Note that $j = 0$ corresponds
to the position at the extreme left (there is no digit $h = j = 0$);
$j = 40$ corresponds to the position at the extreme right (there is
no position $h = j + 1 = 41$); and $j = 1$ corresponds to our above
choice. Whatever our choice of $j$, it does not affect the correctness
of the accumulator's addition. (This is equally true for subtraction,
cf. below, but not for multiplication and division, cf. 5.8.) Indeed,
we have merely multiplied all numbers by $2^{j-1}$ (as against our
previous convention), and such a "change of scale" has no effect
on addition (and subtraction). However, now the accumulator is
an adder which allows errors that are integer multiples of $2^j$: it
is an adder modulo $2^j$. We mention this because it is occasionally
convenient to think in terms of a convention which places the
binary point at the right end of the digital aggregate. Then $j = 40$,
our numbers are integers, and the accumulator is an adder modulo
$2^{40}$. We must emphasize, however, that all of this, i.e. all attributions
of values to $j$, is purely convention: it is solely the
mathematician's interpretation of the functioning of the machine
and not a physical feature of the machine. This convention will
necessitate measures that have to be made effective by actual
physical features of the machine, i.e. the convention will become
a physical and engineering reality, only when we come to the
organs of multiplication.
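
The independence of the adder from the position of the binary point can be checked with a small Python sketch. It rests on our own modelling assumptions: a word is a 40-bit integer, and the hypothetical helper `value(word, j)` reads it with the binary point between digits $j$ and $j + 1$.

```python
from fractions import Fraction

WORD_BITS = 40
MASK = (1 << WORD_BITS) - 1

def value(word, j):
    """Read a 40-digit word with the binary point between digits j and j + 1."""
    return Fraction(word, 1 << (WORD_BITS - j))

def add_words(a, b):
    """The same 40-position adder whatever j is; the left-most carry is lost."""
    return (a + b) & MASK

a, b = 0xF0F0F0F0F0, 0x3333333333   # two arbitrary 40-digit words whose sum overflows
s = add_words(a, b)
for j in (0, 1, 40):
    error = value(a, j) + value(b, j) - value(s, j)
    # The error is always an integer multiple of 2**j.
    print(j, error, error % (2 ** j) == 0)
```

The digits produced are identical for every $j$; only the reading of the lost carry changes.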

We will use the convention $j = 1$, i.e. our numbers lie between 0 and
2 and the accumulator adds modulo 2.

This being so, these numbers between 0 and 2 can be used to
represent all numbers modulo 2. Any real number x agrees modulo
2 with one and only one number $\bar{x}$ between 0 and 2, or, to be
quite precise: $0 \le \bar{x} < 2$. Since our addition functions modulo 2,
we see that the accumulator may be used to represent and to add
numbers modulo 2.

This determines the representation of negative numbers: If
$x < 0$, then we have to find the unique integer multiple of 2, $2s$
($s = 1, 2, \ldots$) such that $0 \le \bar{x} < 2$ for $\bar{x} = x + 2s$ (i.e. $-2s \le
x < 2(1 - s)$), and represent x by the digitalization of $\bar{x}$.

In this way, however, the sign digit character of the left-most
digit is lost: It can be 0 or 1 for both $x \ge 0$ and $x < 0$, hence
0 in the left-most position can no longer be associated with the
+ sign of x. This may seem a bad deficiency of the system, but
it is easy to remedy, at least to an extent which suffices for our
purposes. This is done as follows:

We usually work with numbers x between $-1$ and 1, or, to
be quite precise: $-1 \le x < 1$. Now the $\bar{x}$ with $0 \le \bar{x} < 2$, which
differs from x by an integer multiple of 2, behaves as follows: If
$x \ge 0$, then $0 \le x < 1$, hence $\bar{x} = x$, and so $0 \le \bar{x} < 1$: the left-most
digit of $\bar{x}$ is 0. If $x < 0$, then $-1 \le x < 0$, hence $\bar{x} = x + 2$,
and so $1 \le \bar{x} < 2$: the left-most digit of $\bar{x}$ is 1. Thus the left-most
digit (of $\bar{x}$) is now a precise equivalent of the sign (of x): 0 corresponds
to + and 1 to $-$.

Summing  up: 

The accumulator may be taken to represent all real numbers
modulo 2, and it adds them modulo 2. If x lies between $-1$ and
1 (precisely: $-1 \le x < 1$), as it will in almost all of our uses of
the machine, then the left-most digit represents the sign: 0 is +
and 1 is $-$.
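
The summary above can be exercised with a minimal Python sketch. The helper names `encode` and `decode` and the choice of exact `Fraction` arithmetic are our assumptions for illustration; the report itself describes hardware, not a program.

```python
from fractions import Fraction

SCALE = 1 << 39                     # 39 digits to the right of the binary point
MASK = (1 << 40) - 1                # 40 positions in all

def encode(x):
    """Map -1 <= x < 1 to the word of x_bar = x mod 2 (exact for multiples of 2**-39)."""
    x = Fraction(x)
    assert -1 <= x < 1
    return int(x * SCALE) & MASK

def decode(word):
    """Left-most digit 0 means x = x_bar; left-most digit 1 means x = x_bar - 2."""
    x_bar = Fraction(word, SCALE)
    return x_bar - 2 if word >> 39 else x_bar

half = encode(Fraction(1, 2))
minus_three_quarters = encode(Fraction(-3, 4))
total = (half + minus_three_quarters) & MASK    # addition modulo 2
print(decode(total))                            # -1/4
```

Adding the two words modulo 2 and reading the left-most digit as the sign recovers the signed sum, as long as that sum itself lies in $-1 \le x < 1$.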

Consider now a negative number x with $-1 \le x < 0$. Put
$x = -y$, $0 < y \le 1$. Then we digitalize x by representing it as
$x + 2 = 2 - y = 1 + (1 - y)$. That is, the left-most (sign) digit
of $x = -y$ is, as it should be, 1; and the remaining 39 digits are
those of the complement of $y = -x = |x|$, i.e. those of $1 - y$.
Thus we have been led to the familiar representation of negative
numbers by complementation.

The connection between the digits of x and those of $-x$ is now
easily formulated, for any $x \gtreqless 0$. Indeed, $-x$ is equivalent to

$$2 - x = \{(2^1 - 2^{-39}) - x\} + 2^{-39} = \Big\{\sum_{i=0}^{39} 2^{-i} - x\Big\} + 2^{-39}.$$

(This digit index $i = 1, \ldots, 39$ is related to our previous digit
index $h = 1, \ldots, 40$ by $i = h - 1$. Actually it is best to treat
$i$ as if its domain included the additional value $i = 0$; indeed
$i = 0$ then corresponds to $h = 1$, i.e. to the sign digit. In any case
$i$ expresses the positional value of the digit to which it refers more
simply than $h$ does: This positional value is $2^{-i} = 2^{-(h-1)}$. Note
that if we had positioned the binary point more generally between
$j$ and $j + 1$, as discussed further above, this positional value would
have been $2^{-(h-j)}$. We now have, as pointed out previously, $j = 1$.)
Hence the digits of $-x$ obtain by subtracting every digit of x from 1 (by
complementing each digit, i.e. by replacing 0 by 1 and 1 by
0) and then adding 1 in the right-most position (and effecting
all the carries that this may cause). (Note how the left-most
digit, interpreted as a sign digit, gets inverted by this procedure,
as it should be.)

A  subtraction  x  —  y  is  therefore  performed  by  the  accumulator, 
Ac, as follows: Form $x + y'$, where $y'$ has a digit 0 or 1 where
y  has  a  digit  1  or  0,  respectively,  and  then  add  1  in  the  right-most 
position.  The  last  operation  can  be  performed  by  injecting  a  carry 
into  the  right-most  stage  of  Ac — since  this  stage  can  never  receive 
a  carry  from  any  other  source  (there  being  no  further  positions 
to  the  right). 
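
The subtraction rule just stated can be sketched in Python as follows. This is a hedged illustration: words are modelled as 40-bit integers, and `complement` and `ac_subtract` are hypothetical names of our own.

```python
MASK = (1 << 40) - 1                # the 40 positions of the accumulator

def complement(word):
    """Replace every 0 by 1 and every 1 by 0 in all 40 positions."""
    return word ^ MASK

def ac_subtract(x, y):
    """Form x + y' and inject a carry into the right-most stage."""
    carry_in = 1                    # the right-most stage receives no other carry
    return (x + complement(y) + carry_in) & MASK

# 5 - 7 in units of the right-most digit: the result stands for -2 modulo 2**40.
print(ac_subtract(5, 7) == (5 - 7) % (1 << 40))
```

Because complementing and adding 1 is the same as adding $2 - y$ modulo 2, the adder needs no separate borrowing mechanism.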

5.8.  In  the  light  of  5.7  multiplication  requires  special  care, 
because  here  the  entire  modulo  2  procedure  breaks  down.  Indeed, 
assume  that  we  want  to  compute  a  product  xy,  and  that  we  had 
to change one of the factors, say x, by an integer multiple of 2,
say by 2. Then the product $(x + 2)y$ obtains, and this differs from
the desired xy by $2y$. $2y$, however, will not in general be an integer
multiple  of  2,  since  y  is  not  in  general  an  integer. 
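
A small numerical instance may make the breakdown concrete. The values $x = -1/2$ and $y = 1/4$ below are our own choice for illustration.

```python
from fractions import Fraction

x, y = Fraction(-1, 2), Fraction(1, 4)
x_bar = x % 2                           # 3/2, the stand-in for x modulo 2

wanted = (x * y) % 2                    # 15/8, i.e. -1/8 modulo 2
obtained = (x_bar * y) % 2              # 3/8
difference = (obtained - wanted) % 2    # 1/2, which equals 2y

print(wanted, obtained, difference)
```

The two results differ by $2y = 1/2$, which is not an integer multiple of 2, so the discrepancy cannot be ignored as it could be in addition.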

We  will  therefore  begin  our  discussion  of  the  multiplication 
by eliminating all such difficulties, and assume that both factors
x, y lie between 0 and 1. Or, to be quite precise: $0 \le x < 1$,
$0 \le y < 1$.

To  effect  such  a  multiplication  we  first  send  the  multiplier  x 
into a register AR, the Arithmetic Register, which is essentially just
a  set  of  40  flip-flops  whose  characteristics  will  be  discussed  below. 
We  place  the  multiplicand  y  in  the  Selectron  Register,  SR  (cf.  4.9) 
and use the accumulator, Ac, to form and store the partial products.
We propose to multiply the entire multiplicand by the
successive digits of the multiplier in a serial fashion. There are,
of course, two possible ways this can be done: We can either start
with the digit in the lowest position (position $2^{-39}$) or in the
highest position (position $2^{-1}$) and proceed successively to the
left or right, respectively. There are a few advantages from our
point  of  view  in  starting  with  the  right-most  digit  of  the  multiplier. 
We  therefore  describe  that  scheme. 

The multiplication takes place in 39 steps, which correspond
to the 39 (non-sign) digits of the multiplier
$x = 0, \xi_1, \xi_2, \ldots, \xi_{39} = (0.\xi_1\xi_2\ldots\xi_{39})$, enumerated backwards: $\xi_{39}, \ldots, \xi_2, \xi_1$.
Assume that the $k - 1$ first steps ($k = 1, \ldots, 39$) have already
taken place, involving multiplication of the multiplicand y with
the $k - 1$ last digits of the multiplier: $\xi_{39}, \ldots, \xi_{41-k}$; and that we
are now at the kth step, involving multiplication with the kth last
digit: $\xi_{40-k}$. Assume furthermore, that Ac now contains the quantity
$p_{k-1}$, the result of the $k - 1$ first steps. [This is the $(k - 1)$st partial
product. For $k = 1$ clearly $p_0 = 0$.] We now form $2p_k = p_{k-1} + \xi_{40-k}\,y$, i.e.

$$2p_k = p_{k-1} + y_k, \qquad y_k = \begin{cases} 0 & \text{for } \xi_{40-k} = 0 \\ y & \text{for } \xi_{40-k} = 1 \end{cases} \tag{1}$$

That is, we do nothing or add y, according to whether $\xi_{40-k} = 0$
or 1. We can then form $p_k$ by halving $2p_k$.

Note that the addition of (1) produces no carry beyond the $2^0$
position, i.e. the sign digit: $0 \le p_h < 1$ is true for $h = 0$, and if
it is true for $h = k - 1$, then (1) extends it to $h = k$ also, since
$0 \le y_k < 1$. Hence the sum in (1) is $\ge 0$ and $< 2$, and no carries
beyond the $2^0$ position arise.

Hence $p_k$ obtains from $2p_k$ by a simple right shift, which is
combined with filling in the sign digit (that is freed by this shift)
with a 0. This right shift is effected by an electronic shifter that
is part of Ac.

Now

$$p_{39} = 2^{-1}\{2^{-1}[\,\cdots\, 2^{-1}(2^{-1}\xi_{39}y + \xi_{38}y) + \cdots + \xi_2 y] + \xi_1 y\} = \sum_{i=1}^{39} 2^{-i}\xi_i y = xy.$$

Thus this process produces the product xy, as desired. Note that
this xy is the exact product of x and y.

Since  x  and  y  are  39  digit  binaries,  their  exact  product  xy  is 
a 78 digit binary (we disregard the sign digit throughout). However,
Ac will only hold 39 of these. These are clearly the left 39
digits of xy. The right 39 digits of xy are dropped from Ac one
by one in the course of the 39 steps, or to be more specific, of
the 39 right shifts. We will see later that these right 39 digits of
xy should and will also be conserved (cf. the end of this section
and  the  end  of  5.12,  as  well  as  6.6.3).  The  left  39  digits,  which 
remain  in  Ac,  should  also  be  rounded  off,  but  we  will  not  discuss 
this  matter  here  (cf.  loc.  cit.  above  and  9.9,  Part  II). 
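
The 39 add-and-shift steps of formula (1) can be sketched in Python. The sketch assumes that each factor is held as a 39-bit integer (value divided by $2^{39}$) and that digits shifted out of Ac are simply discarded; `multiply_positive` is a hypothetical name of ours.

```python
N = 39                                   # non-sign digits per word

def multiply_positive(x_digits, y_word):
    """x_digits: the digits xi_1 ... xi_39 as an integer (xi_39 is the low bit).
    y_word: the multiplicand, 0 <= y < 1, as an integer with N bits.
    Returns the contents of Ac after the 39 steps."""
    ac = 0                               # p_0 = 0
    for k in range(1, N + 1):
        xi = (x_digits >> (k - 1)) & 1   # xi_(40-k), the k-th last digit
        if xi:
            ac += y_word                 # 2 p_k = p_(k-1) + y
        ac >>= 1                         # halve: the right-most digit drops out
    return ac

x = 3 << 37                              # 3/4
y = 1 << 38                              # 1/2
print(multiply_positive(x, y) == 3 << 36)   # 3/8
```

Under these assumptions Ac ends up holding the left 39 digits of the 78-digit product, without any rounding.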

To complete the general picture of our multiplication technique
we must consider how we sense the respective digits of our
multiplier. There are two schemes which come to one's mind in
this connection. One is to have a gate tube associated with each
flip-flop of AR in such a fashion that this gate is open if a digit
is 1 and closed if it is null. We would then need a 39-stage counter
to act as a switch which would successively stimulate these gate
tubes to react. A more efficient scheme is to build into AR a shifter
circuit which enables AR to be shifted one stage to the right each
time Ac is shifted and to sense the value of the digit in the right-most
flip-flop of AR. The shifter itself requires one gate tube per
stage. We need in addition a counter to count out the 39 steps
of the multiplication, but this can be achieved by a six stage binary
counter. Thus the latter is more economical of tubes and has one
additional virtue from our point of view which we discuss in the
next paragraph.


The choice of 40 digits to a word (including the sign) is probably
adequate for most computational problems but situations
certainly might arise when we desire higher precision, i.e. words
of greater length. A trivial illustration of this would be the computation
of $\pi$ to more places than are now known (about 700
decimals, i.e. about 2,300 binaries). More important instances are
the solutions of N linear equations in N variables for large values
of N. The extra precision becomes probably necessary when N
exceeds a limit somewhere between 20 and 40. A justification of
this estimate has to be based on a detailed theory of numerical
matrix inversion which will be given in a subsequent report. It
is therefore desirable to be able to handle numbers of $39k$ digits
and signs by means of program instructions. One way to achieve
this end is to use k words to represent a $39k$ digit number with
signs. (In this way 39 digits in each 40 digit word are used, but
all sign digits excepting the first one, are apparently wasted; cf.
however the treatment of double precision numbers in Chapter
9, Part II.) It is, of course, necessary in this case to instruct the
machine to perform the elementary operations of arithmetic in
a manner that conforms with this interpretation of k-word complexes
as single numbers. (Cf. 9.8-9.10, Part II.) In order to be
able to treat numbers in this manner, it is desirable to keep not
39 digits in a product, but 78; this is discussed in more detail in
6.6.3 below. To accomplish this end (conserving 78 product digits)
we connect, via our shifter circuit, the right-most digit of Ac with
the left-most non-sign digit of AR. Thus, when in the process of
multiplication a shift is ordered, the last digit of Ac is transferred
into the place in AR made vacant when the multiplier was shifted.
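
The coupling of Ac and AR through the shifter can be sketched as follows. As before this is an illustration under our own assumptions (39-bit integers for the non-sign digits, sign digits ignored); `multiply_double_length` is a hypothetical name.

```python
N = 39                                   # non-sign digits per register

def multiply_double_length(x_word, y_word):
    """Return (ac, ar): the left and the right 39 digits of the 78-digit product."""
    ac, ar = 0, x_word                   # AR starts out holding the multiplier
    for _ in range(N):
        if ar & 1:                       # sense the right-most flip-flop of AR
            ac += y_word
        low = ac & 1                     # the digit about to leave Ac
        ac >>= 1
        # AR shifts right too; the freed left-most non-sign place takes Ac's digit.
        ar = (ar >> 1) | (low << (N - 1))
    return ac, ar

x = 0x5555555555 >> 1                    # arbitrary 39-digit factors
y = 0x3333333333 >> 1
ac, ar = multiply_double_length(x, y)
print(((ac << N) | ar) == x * y)
```

Each step consumes one multiplier digit from the right of AR and deposits one product digit at its left, so no extra register is needed for the low half of the product.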

5.9. To conclude our discussion of the multiplication of positive
numbers, we note this:

As described thus far, the multiplier forms the 78 digit product,
xy, for a 39 digit multiplier x and a 39 digit multiplicand y. We
assumed $x \ge 0$, $y \ge 0$ and therefore had $xy \ge 0$, and we will only
depart from these assumptions in 5.10. In addition to these, however,
we also assumed $x < 1$, $y < 1$, i.e. that x, y have their binary
points both immediately right of the sign digit, which implied the
same for xy. One might question the necessity of these additional
assumptions.

Prima facie they may seem mere conventions, which affect only
the mathematician's interpretation of the functioning of the machine,
and not a physical feature of the machine. (Cf. the corresponding
situation in addition and subtraction, in 5.7.) Indeed,
if x had its binary point between digits $j$ and $j + 1$ from the left
(cf. the discussion of 5.7 dealing with this $j$; it also applies to $k$
below), and y between $k$ and $k + 1$, then our above method of
multiplication would still give the correct result xy, provided that
the position of the binary point in xy is appropriately assigned.
Specifically: Let the binary point of xy be between digits $l$ and
$l + 1$. x has the binary point between digits $j$ and $j + 1$, and its
sign digit is 0, hence its range is $0 \le x < 2^{j-1}$. Similarly y has the
range $0 \le y < 2^{k-1}$, and xy has the range $0 \le xy < 2^{l-1}$. Now the
ranges of x and y imply that the range of xy is necessarily
$0 \le xy < 2^{j-1} \cdot 2^{k-1} = 2^{j+k-2}$. Hence $l = j + k - 1$. Thus it might
seem that our actual positioning of the binary point (immediately
right of the sign digit, i.e. $j = k = 1$) is still a mere convention.

It is therefore important to realize that this is not so: The
choices of $j$ and $k$ actually correspond to very real, physical, engineering
decisions. The reason for this is as follows: It is desirable
to base the running of the machine on a sole, consistent mathematical
interpretation. It is therefore desirable that all arithmetical
operations be performed with an identically conceived positioning
of the binary point in Ac. Applying this principle to x and
y gives $j = k$. Hence the position of the binary point for xy is given
by $j + k - 1 = 2j - 1$. If this is to be the same as for x and y,
then $2j - 1 = j$, i.e. $j = 1$ ensues; that is, our above positioning
of the binary point immediately right of the sign digit.

There is one possible escape: To place into Ac not the left 39
digits of xy (not counting the sign digit 0), but the digits $j$ to $j + 38$
from the left. Indeed, in this way the position of the binary point
of xy will be $(2j - 1) - (j - 1) = j$, the same as for x and y.

This procedure means that we drop the left $j - 1$ and right
$40 - j$ digits of xy and hold the middle 39 in Ac. Note that positioning
of the binary point means that $x < 2^{j-1}$, $y < 2^{j-1}$ and xy
can only be used if $xy < 2^{j-1}$. Now the assumptions secure only
$xy < 2^{2j-2}$. Hence xy must be $2^{j-1}$ times smaller than it might be.
This is just the thing which would be secured by the vanishing
of the left $j - 1$ digits that we had to drop from Ac, as shown
above.

If we wanted to use such a procedure, with those dropped left
$j - 1$ digits really existing, i.e. with $j \ne 1$, then we would have
to make physical arrangements for their conservation elsewhere.
Also the general mathematical planning for the machine would
be definitely complicated, due to the physical fact that Ac now
holds a rather arbitrarily picked middle stretch of 39 digits from
among the 78 digits of xy. Alternatively, we might fail to make
such arrangements, but this would make it necessary to see to it, in the
mathematical planning of each problem, that all products turn
out to be $2^{j-1}$ times smaller than their a priori maxima. Such an
observance is not at all impossible; indeed similar things are unavoidable
for the other operations. [For example, with a factor
2 in addition (of positives) or subtraction (of opposite sign quantities).
Cf. also the remarks in the first part of 5.12, dealing with
keeping "within range".] However, it involves a loss of significant
digits, and the choice $j = 1$ makes it unnecessary in multiplication.

We will therefore make our choice $j = 1$, i.e. the positioning
of the binary point immediately right of the sign digit, binding
for all that follows.

5.10. We now pass to the case where the multiplier x and
the multiplicand y may have either sign + or $-$, i.e. any combination
of these signs.

It would not do simply to extend the method of 5.8 to include
the sign digits of x and y also. Indeed, we assume $-1 \le x < 1$,
$-1 \le y < 1$, and the multiplication procedure in question is definitely
based on the $\ge 0$ interpretations of x and y. Hence if $x < 0$,
then it is really using $x + 2$, and if $y < 0$, then it is really using
$y + 2$. Hence for $x < 0$, $y \ge 0$ it forms

$$(x + 2)y = xy + 2y;$$

for $x \ge 0$, $y < 0$ it forms

$$x(y + 2) = xy + 2x;$$

for $x < 0$, $y < 0$ it forms

$$(x + 2)(y + 2) = xy + 2x + 2y + 4,$$

or since things may be taken modulo 2, $xy + 2x + 2y$. Hence
correction terms $-2y$, $-2x$ would be needed for $x < 0$, $y < 0$,
respectively (either or both).

This  would  be  a  possible  procedure,  but  there  is  one  difficulty: 
As xy is formed, the 39 digits of the multiplier x are gradually
lost  from  AR,  to  be  replaced  by  the  right  39  digits  of  xy.  (Cf.  the 
discussion  at  the  end  of  5.8.)  Unless  we  are  willing  to  build  an 
additional  40  stage  register  to  hold  x,  therefore,  x  will  not  be 
available  at  the  end  of  the  multiplication.  Hence  we  cannot  use 
it  in  the  correction  2x  of  xy,  which  becomes  necessary  for  y  <  0. 

Thus  the  case  x  <  0  can  be  handled  along  the  above  lines,  but 
not  the  case  y  <  0. 

It  is  nevertheless  possible  to  develop  an  adequate  procedure, 
and we now proceed to do this. Throughout this procedure we
will maintain the assumptions $-1 \le x < 1$, $-1 \le y < 1$. We
proceed  in  several  successive  steps. 

First: Assume that the corrections necessitated by the possibility
of $y < 0$ have been taken care of. We permit therefore
$y \gtreqless 0$. We will consider the corrections necessitated by the possibility
of $x < 0$.

Let us disregard the sign digit of x, which is 1, i.e. replace it
by 0. Then x goes over into $x' = x - 1$ and as $-1 \le x < 0$, this
$x'$ will actually behave like $(x - 1) + 2 = x + 1$. Hence our
multiplication procedure will produce $x'y = (x + 1)y = xy + y$,
and therefore a correction $-y$ is needed at the end. (Note that
we did not use the sign digit of x in the conventional way. Had
we done so, then a correction $-2y$ would have been necessary,
as seen above.)

We see therefore: Consider $x \gtreqless 0$. Perform first all necessary
steps for forming $x'y$ ($y \gtreqless 0$), without yet reaching the sign digit
of x (i.e. treating x as if it were $\ge 0$). When the time arrives at
which the digit $\xi_0$ of x has to become effective, i.e. immediately
after $\xi_1$ became effective, after 39 shifts (cf. the discussion near
the end of 5.8), at which time Ac contains, say, $\bar{p}$ (this corresponds
to the $p_{39}$ of 5.8), then form

$$p = \begin{cases} \bar{p} & \text{for } \xi_0 = 0 \\ \bar{p} - y & \text{for } \xi_0 = 1 \end{cases}$$

This p is xy. (Note the difference between this last step, forming
p, and the 39 preceding steps in 5.8, forming $p_1, p_2, \ldots, p_{39}$.)

Second: Having disposed of the possibility $x < 0$, we may now
assume $x \ge 0$. With this assumption we have to treat all $y \gtreqless 0$.
Since $y \ge 0$ brings us back entirely to the familiar case of 5.8, we
need to consider the case $y < 0$ only.

Let $y'$ be the number that obtains by disregarding the sign digit
of y which is 1, i.e. by replacing it by 0. Again $y'$ acts not like
$y - 1$, but like $(y - 1) + 2 = y + 1$. Hence the multiplication
procedure of 5.8 will produce $xy' = x(y + 1) = xy + x$, and therefore
a correction $-x$ is needed. (Note that, quite similarly to what
we saw in the first case above, the suppression of the sign digit
of y replaced the previously recognized correction $-2x$ by the
present one $-x$.) As we observed earlier, this correction $-x$ cannot
be applied at the end to the completed $xy'$ since at that time x
is no longer available. Hence we must apply the correction $-x$
digitwise, subtracting every digit at the time when it is last found
in AR, and in a way that makes it effective with the proper positional
value.

Third: Consider then $x = 0, \xi_1, \xi_2, \ldots, \xi_{39} = (0.\xi_1\xi_2\ldots\xi_{39})$.
The 39 digits $\xi_1, \ldots, \xi_{39}$ of x are lost in the course of the 39 shifts
of the multiplication procedure of 5.8, going from right to left.
Thus the operation No. $k + 1$ ($k = 0, 1, \ldots, 38$, cf. 5.8) finds
$\xi_{39-k}$ in the right-most stage of AR, uses it, and then loses it
through its concluding right shift (of both Ac and AR). After this
step $39 - (k + 1) = 38 - k$ further steps, i.e. shifts follow, hence
before its own concluding shift there are still $39 - k$ shifts to come.
Hence the positional values are $2^{39-k}$ times higher than they will
be at the end. $\xi_{39-k}$ should appear at the end, in the correcting
term $-x$, with the sign $-$ and the positional value $2^{-(39-k)}$. Hence
we may inject it during the step $k + 1$ (before its shift) with the
sign $-$ and the positional value 1. That is to say, $-\xi_{39-k}$ in the
sign digit.

This, however, is inadmissible. Indeed, $\xi_{39-k}$ might cause carries
(if $\xi_{39-k} = 1$), which would have nowhere to go from the sign digit
(there being no further positions to the left). This error is at its
origin an integer multiple of 2, but the $39 - k$ subsequent shifts
reduce its positional value $2^{39-k}$ times. Hence it might contribute
to the end result any integer multiple of $2^{-(38-k)}$, and this is a
genuine error.

Let us therefore add $1 - \xi_{39-k}$ to the sign digit, i.e. 0 or 1 if
$\xi_{39-k}$ is 1 or 0, respectively. We will show further below, that with
this procedure there arise no carries of the inadmissible kind.
Taking this momentarily for granted, let us see what the total
effect is. We are correcting not by $-x$ but by
$\sum_{i=1}^{39} 2^{-i} - x = 1 - 2^{-39} - x$. Hence a final correction by $-1 + 2^{-39}$ is
needed. Since this is done at the end (after all shifts), it may be
taken modulo 2. That is to say, we must add $1 + 2^{-39}$, i.e. 1 in
each of the two extreme positions. Adding 1 in the right-most
position has the same effect as in the discussion at the end of 5.7
(dealing with the subtraction). It is equivalent to injecting a carry
into the right-most stage of Ac. Adding 1 in the left-most position,
i.e. to the sign digit, produces a 1, since that digit was necessarily
0. (Indeed, the last operation ended in a shift, thus freeing the
sign digit, cf. below.)

Fourth: Let us now consider the question of the carries that
may arise in the 39 steps of the process described above. In order
to do this, let us describe the kth step ($k = 1, \ldots, 39$), which
is a variant of the kth step described for a positive multiplication
in 5.8, in the same way in which we described the original kth
step loc. cit. That is to say, let us see what the formula (1) of 5.8
has become. It is clearly $2p_k = p_{k-1} + (1 - \xi_{40-k}) + \xi_{40-k}\,y'$, i.e.

$$2p_k = p_{k-1} + y'_k, \qquad y'_k = \begin{cases} 1 & \text{for } \xi_{40-k} = 0 \\ y' & \text{for } \xi_{40-k} = 1 \end{cases} \tag{2}$$

That is, we add 1 (y's sign digit) or $y'$ (y without its sign digit),
according to whether $\xi_{40-k} = 0$ or 1. Then $p_k$ should obtain from
$2p_k$ again by halving.

Now the addition of (2) produces no carries beyond the $2^0$
position, as we asserted earlier, for the same reason as the addition
of (1) in 5.8. We can argue in the same way as there: $0 \le p_h < 1$
is true for $h = 0$, and if it is true for $h = k - 1$, then (2) extends
it to $h = k$ also, since $0 \le y'_k \le 1$. Hence the sum in (2) is $\ge 0$
and $< 2$, and no carries beyond the $2^0$ position arise.

Fifth: In the three last observations we assumed $y < 0$. Let
us now restore the full generality of $y \gtreqless 0$. We can then describe
the equations (1) of 5.8 (valid for $y \ge 0$) and (2) above (valid for
$y < 0$) by a single formula,

$$2p_k = p_{k-1} + y''_k, \qquad y''_k = \begin{cases} y\text{'s sign digit} & \text{for } \xi_{40-k} = 0 \\ y \text{ without its sign digit} & \text{for } \xi_{40-k} = 1 \end{cases} \tag{3}$$

Thus our verbal formulation of (2) applies here, too: We add y's
sign digit or y without its sign, according to whether $\xi_{40-k} = 0$
or 1. All $p_k$ are $\ge 0$ and $< 1$, and the addition of (3) never originates
a carry beyond the $2^0$ position. $p_k$ obtains from $2p_k$ by a right
shift, filling the sign digit with a 0. (Cf. however, Part II, Table
2 for another sort of right shift that is desirable in explicit form,
i.e. as an order.)

For $y \ge 0$, xy is $p_{39}$; for $y < 0$, xy obtains from $p_{39}$ by injecting
a carry into the right-most stage of Ac and by placing a 1 into
the sign digit in Ac.

Sixth: This procedure applies for $x \ge 0$. For $x < 0$ it should
also be applied, since it makes use of x's non-sign digits only, but
at the end y must be subtracted from the result.

This  method  of  binary  multiplication  will  be  illustrated  in  some 
examples  in  5.15. 
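
Pending those examples, the six observations of 5.10 can be gathered into one Python sketch. It is a hedged model, not the machine's circuitry: words are 40-bit integers in the complement representation of 5.7, digits shifted out of Ac are discarded, and the names `multiply_signed` and `word` are our own.

```python
N = 39
SIGN = 1 << N                       # positional value 2**0, the sign digit
MASK = (1 << (N + 1)) - 1           # the 40 positions of Ac

def multiply_signed(x, y):
    """x, y: 40-digit words standing for numbers with -1 <= value < 1.
    Returns the 40-digit word left in Ac."""
    y_sign = y >> N                 # y's sign digit
    y_rest = y & (SIGN - 1)         # y without its sign digit
    ac = 0
    for k in range(1, N + 1):
        xi = (x >> (k - 1)) & 1     # xi_(40-k); only x's non-sign digits are used
        ac += y_rest if xi else y_sign * SIGN   # formula (3)
        ac >>= 1                    # right shift, sign digit refilled with 0
    if y_sign:                      # y < 0: add 1 in each of the two extreme positions
        ac = (ac + 1 + SIGN) & MASK
    if x >> N:                      # x < 0: subtract y at the end, modulo 2
        ac = (ac - y) & MASK
    return ac

def word(numerator):
    """Hypothetical helper: the word for the value numerator / 2**39."""
    return numerator & MASK

minus_half = word(-(1 << 38))
three_quarters = word(3 << 37)
print(multiply_signed(minus_half, three_quarters) == word(-(3 << 36)))   # -3/8
```

Since the sketch keeps only the 40 digits of Ac, what it returns is the product cut off after 39 binary places (rounded downward), in line with the remark in 5.8 that rounding is treated elsewhere. The product of $-1$ by $-1$ falls outside the range $-1 \le xy < 1$ and is not represented correctly.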

5.11.  To  complete  our  discussion  of  the  multiplicative  organs 
of  our  machine  we  must  return  to  a  consideration  of  the  types 
of  accumulators  mentioned  in  5.5.  The  static  accumulator  operates 
as  an  adder  by  simultaneously  applying  static  voltages  to  its  two 
inputs — one  for  each  of  the  two  numbers  being  added.  When 
steady-state  operation  is  reached  the  total  sum  is  formed  complete 
with  all  carries.  For  such  an  accumulator  the  above  discussion  is 
substantially  complete,  except  that  it  should  be  remarked  that  such 
a  circuit  requires  at  most  39  rise  times  to  complete  a  carry. 
Actually  it  is  possible  that  the  duration  of  these  successive  rises 
is  proportional  to  a  lower  power  of  39  than  the  first  one. 

Each  stage  of  a  dynamic  accumulator  consists  of  a  binary 
counter  for  registering  the  digit  and  a  flip-flop  for  temporary 
storage  of  the  carry.  The  counter  receives  a  pulse  if  a  1  is  to  be 
added  in  at  that  place;  if  this  causes  the  counter  to  go  from  1 
to  0  a  carry  has  occurred  and  hence  the  carrv  flip-flop  will  be 
set.  It  then  remains  to  perform  the  carries.  Each  flip-flop  has 
associated  with  it  a  gate,  the  output  of  which  is  connected  to  the 
next  binary  counter  to  the  left.  The  carry  is  begun  by  pulsing  all 
carry  gates.  Now  a  carry  may  produce  a  carry,  so  that  the  process 
needs  to  be  repeated  until  all  carry  flip-flops  register  0.  This  can 
be detected by means of a circuit involving a sensing tube connected
to each carry flip-flop. It was shown in 5.6 that, on the
average, five pulse times (flip-flop reaction times) are required for
the complete carry. An alternative scheme is to connect a gate
tube to each binary counter which will detect whether an incoming
carry pulse would produce a carry and will, under this circumstance,
pass the incoming carry pulse directly to the next
stage. This circuit would require at most 39 rise times for the
completion of the carry. (Actually less, cf. above.)
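
The first kind of dynamic accumulator described above (binary counters plus carry flip-flops, with the carry gates pulsed until every flip-flop registers 0) can be sketched as follows. This is only an illustrative model under stated assumptions: the registers are represented as Python integers, and the names `dynamic_add`, `digits`, `carries` and `pulses` are hypothetical and do not come from the report.

```python
def dynamic_add(acc, addend, n):
    """Add `addend` into an n-stage dynamic accumulator holding `acc`.

    Returns (sum modulo 2**n, number of carry-gate pulsings needed).
    """
    mask = (1 << n) - 1
    # Pulse the counters: a counter that goes from 1 to 0 sets its carry flip-flop.
    carries = acc & addend
    digits = acc ^ addend
    pulses = 0
    while carries:                        # sensing circuit: is any flip-flop still set?
        incoming = (carries << 1) & mask  # each gate feeds the next counter to the left
        carries = digits & incoming       # a counter going 1 -> 0 sets its flip-flop
        digits ^= incoming                # the counters receiving a pulse toggle
        pulses += 1
    return digits, pulses
```

For example, `dynamic_add(0b0101, 0b0011, 4)` needs three pulsings because the carry ripples through three stages; most operand pairs need far fewer, which is the point of the average quoted from 5.6.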

At  the  present  time  the  development  of  a  static  accumulator 
is  being  concluded.  From  preliminary  tests  it  seems  that  it  will 
add two numbers in about 5 μsec and will shift right or left in
about 1 μsec.

We  return  now  to  the  multiplication  operation.  In  a  static 
accumulator we order simultaneously an addition of the multiplicand
with sign deleted or the sign of the multiplicand (cf. 5.10)
and  a  complete  carry  and  then  a  shift  for  each  of  the  39  steps. 
In  a  dynamic  accumulator  of  the  second  kind  just  described  we 
order  in  succession  an  addition  of  the  multiplicand  with  sign 
deleted  or  the  sign  of  the  multiplicand,  a  complete  carry,  and  a 
shift for each of the 39 steps. In a dynamic accumulator of the
first  kind  we  can  avoid  losing  the  time  required  for  completing 
the  carry  (in  this  case  an  average  of  5  pulse  times,  cf.  above)  at 
each  of  the  39  steps.  We  order  an  addition  by  the  multiplicand 
with  sign  deleted  or  the  sign  of  the  multiplicand,  then  order  one 
pulsing  of  the  carry  gates,  and  finally  shift  the  contents  of  both 
the  digit  counters  and  the  carry  flip-flops.  This  process  is  repeated 
39  times.  A  simple  arithmetical  analysis  which  may  be  carried  out 
in  a  later  report,  shows  that  at  each  one  of  these  intermediate 
stages  a  single  carry  is  adequate,  and  that  a  complete  set  of  carries 
is  needed  at  the  end  only.  We  then  carry  out  the  complement 
corrections,  still  without  ever  ordering  a  complete  set  of  carry 
operations.  When  all  these  corrections  are  completed  and  after 
round-off, described below, we then order the complete carry
mentioned  above. 
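
The scheme for the dynamic accumulator of the first kind (one addition, one pulsing of the carry gates, then a shift of both the digit counters and the carry flip-flops, with a complete carry only at the end) can be modelled as below. This is a hedged sketch, not the report's circuit: it treats only non-negative operands, omits the complement corrections and the round-off, and uses Python integers for the registers. The function and variable names are hypothetical.

```python
def carry_save_multiply(multiplier, multiplicand, n):
    """Multiply two non-negative n-bit integers; returns the 2n-bit product."""
    digits = 0    # the binary counters of Ac
    carries = 0   # carry flip-flops: bit i is a pending carry out of stage i
    low = 0       # digits shifted out to the right (the role played by AR)
    for step in range(n):
        if (multiplier >> step) & 1:
            new_carries = digits & multiplicand
            # A flip-flop that is already set never needs setting again here.
            assert carries & new_carries == 0
            digits ^= multiplicand
            carries |= new_carries
        # One pulsing of the carry gates (not a complete carry).
        incoming = carries << 1
        carries = digits & incoming
        digits ^= incoming
        # Shift counters and flip-flops one place to the right.
        low |= (digits & 1) << step
        digits >>= 1
        carries >>= 1
    # The complete set of carries is ordered only once, at the end.
    while carries:
        incoming = carries << 1
        carries = digits & incoming
        digits ^= incoming
    return (digits << n) | low
```

The internal assertion reflects the observation that, after a single pulsing, a stage whose flip-flop is set always holds digit 0, so the next addition cannot ask an already-set flip-flop to store a second carry.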

5.12.  It  is  desirable  at  this  point  in  the  discussion  to  consider 
rules for rounding-off to n digits. In order to assess the characteristics
of alternative possibilities for such properly, and in particular
the role of the concept of "unbiasedness", it is necessary
to  visualize  the  conditions  under  which  rounding-off  is  needed. 

Every  number  x  that  appears  in  the  computing  machine  is  an 
approximation  of  another  number  x',  which  would  have  appeared 
if  the  calculation  had  been  performed  absolutely  rigorously.  The 
approximations  to  which  we  refer  here  are  not  those  that  are 
caused by the explicitly introduced approximations of the numerical-mathematical
set-up, e.g. the replacement of a (continuous)
differential  equation  by  a  (discrete)  difference  equation.  The  effect 
of  such  approximations  should  be  evaluated  mathematically  by  the 
person  who  plans  the  problem  for  the  machine,  and  should  not 
be a direct concern of the machine. Indeed, it has to be handled
by a mathematician and cannot be handled by the machine, since
its nature, complexity, and difficulty may be of any kind, depending
upon the problem under consideration. The approximations
which concern us here are these: Even the elementary operations
of arithmetic, to which the mathematical approximation-formulation
for the machine has to reduce the true (possibly transcendental)
problem, are not rigorously executed by the machine. The
machine  deals  with  numbers  of  n  digits,  where  n,  no  matter  how 
large,  has  to  be  a  fixed  quantity.  (We  assumed  for  our  machine 
40 digits, including the sign, i.e. n = 39.) Now the sum and difference
of two n-digit numbers are again n-digit numbers, but their
product and quotient (in general) are not. (They have, in general,
2n or ∞ digits, respectively.) Consequently, multiplication and
division must unavoidably be replaced by the machine by two
different operations which must produce n digits under all conditions,
and which, subject to this limitation, should lie as close as
possible to the results of the true multiplication and division. One
might call them pseudo-multiplication and pseudo-division; however,
the accepted nomenclature terms them as multiplication and
division  with  round-off.  (We  are  now  creating  the  impression  that 
addition  and  subtraction  are  entirely  free  of  such  shortcomings. 
This is only true inasmuch as they do not create new digits to
the right, as multiplication and division do. However, they can
create new digits to the left, i.e. cause the numbers to "grow out
of range". This complication, which is, of course, well known, is
normally met by the planner, by mathematical arrangements and
estimates to keep the numbers "within range". Since we propose
to have our machine deal with numbers between −1 and 1,
multiplication can never cause them to "grow out of range".
Division, of course, might cause this complication, too. The planner
must therefore see to it that in every division the absolute
value of the divisor exceeds that of the dividend.)

Thus the round-off is intended to produce satisfactory n-digit
approximations for the product xy and the quotient x/y of two
n-digit  numbers.  Two  things  are  wanted  of  the  round-off:  (1)  The 
approximation  should  be  good,  i.e.  its  variance  from  the  "true" 
xy  or  x/y  should  be  as  small  as  practicable;  (2)  The  approximation 
should  be  unbiased,  i.e.  its  mean  should  be  equal  to  the  "true" 
xy  or  x/y. 

These  desiderata  must,  however,  be  considered  in  conjunction 
with  some  further  comments.  Specifically:  (a)  x  and  y  themselves 
are likely to be the results of similar round-offs, directly or indirectly
inherent, i.e. x and y themselves should be viewed as
unbiased  n-digit  approximations  of  "true"  x'  and  y'  values;  (b)  by 
talking  of  "variances"  and  "means"  we  are  introducing  statistical 
concepts. Now the approximations which we are here considering
are not really of a statistical nature, but are due to the peculiarities
(from our point of view, inadequacies) of arithmetic and of digital
representation, and are therefore actually rigorously and uniquely
determined. It seems, however, in the present state of mathematical
science, rather hopeless to try to deal with these matters
rigorously.  Furthermore,  a  certain  statistical  approach,  while  not 
truly  justified,  has  always  given  adequate  practical  results.  This 
consists  of  treating  those  digits  which  one  does  not  wish  to  use 
individually  in  subsequent  calculations  as  random  variables  with 
equiprobable  digital  values,  and  of  treating  any  two  such  digits 
as  statistically  independent  (unless  this  is  patently  false). 

These things being understood, we can now undertake to discuss
round-off procedures, realizing that we will have to apply
them  to  the  multiplication  and  to  the  division. 

Let $x = (.\xi_1 \ldots \xi_n)$ and $y = (.\eta_1 \ldots \eta_n)$ be unbiased approximations
of x' and y'. Then the "true" $xy = (.\zeta_1 \ldots \zeta_n \zeta_{n+1} \ldots \zeta_{2n})$
and the "true" $x/y = (.\omega_1 \ldots \omega_n \omega_{n+1} \omega_{n+2} \ldots)$ (this goes on ad
infinitum!) are approximations of x'y' and x'/y'. Before we discuss
how to round them off, we must know whether the "true" xy and
x/y are themselves unbiased approximations of x'y' and x'/y'. xy
is indeed an unbiased approximation of x'y', i.e. the mean of xy
is the mean of x (= x') times the mean of y (= y'), owing to the
independence assumption which we made above. However, if x
and y are closely correlated, e.g. for x = y, i.e. for squaring, there
is a bias. It is of the order of the mean square of x − x', i.e. of
the variance of x. Since x has n digits, this variance is about $1/2^{2n}$.
(If the digits of x' beyond n are entirely unknown, then our original
assumptions give the variance $1/(12 \cdot 2^{2n})$.) Next, x/y can be written
as $x \cdot y^{-1}$, and since we have already discussed the bias of the
product, it suffices now to consider the reciprocal $y^{-1}$. Now if
y is an unbiased estimate of y', then $y^{-1}$ is not an unbiased estimate
of $y'^{-1}$, i.e. the mean of y's reciprocal is not the reciprocal of y's
mean. The difference is about $y^{-3}$ times the variance of y, i.e. it is
of essentially the same order as the bias found above in the case
of squaring.
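
The two orders of magnitude quoted above can be checked with a short second-order expansion. The following is a sketch under assumptions not spelled out in the text: the errors are written $x = x' + \varepsilon$ and $y = y' + \delta$ with zero means and variances $\sigma_x^2$, $\sigma_y^2$ (these symbols are introduced here only for illustration), an overbar denotes the mean, and terms beyond the second order are dropped.

```latex
\begin{align*}
\overline{x^{2}} &= \overline{(x' + \varepsilon)^{2}}
   = x'^{2} + 2x'\,\overline{\varepsilon} + \overline{\varepsilon^{2}}
   = x'^{2} + \sigma_x^{2}, \\
\overline{y^{-1}} &= \overline{(y' + \delta)^{-1}}
   \approx y'^{-1} - y'^{-2}\,\overline{\delta} + y'^{-3}\,\overline{\delta^{2}}
   = y'^{-1} + y'^{-3}\,\sigma_y^{2}.
\end{align*}
```

With $\sigma^2 = 1/(12 \cdot 2^{2n})$ both biases are of the order $1/2^{2n}$, in agreement with the statement that squaring and forming a reciprocal are biased to essentially the same order.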

It follows from all this that it is futile to attempt to avoid biases
of the order of magnitude $1/2^{2n}$ or less. (The factor 1/12 above may
seem to be changing the order of magnitude in question. However,
it is really the square root of the variance which matters and
$\sqrt{1/12} \approx 0.3$ is a moderate factor.) Since we propose to use n = 39,
therefore $1/2^{78}$ ($\approx 3 \times 10^{-24}$) is the critical case. Note that this
possible bias level is $1/2^{39}$ ($\approx 2 \times 10^{-12}$) times our last significant
digit. Hence we will look for round-off rules to n digits for
the "true" $xy = (.\zeta_1 \ldots \zeta_n \zeta_{n+1} \ldots \zeta_{2n})$ and
$x/y = (.\omega_1 \ldots \omega_n \omega_{n+1} \omega_{n+2} \ldots)$. The desideratum (1) which we formulated
previously, that the variance should be small, is still valid. The
desideratum (2), however, that the bias should be zero, need,
according to the above, only be enforced up to terms of the order
$1/2^{2n}$.

The  round-off  procedures,  which  we  can  use  in  this  connection, 
fall  into  two  broad  classes.  The  first  class  is  characterized  by  its 
ignoring  all  digits  beyond  the  nth,  and  even  the  nth  digit  itself, 
which  it  replaces  by  a  1.  The  second  class  is  characterized  by  the 
procedure of adding one unit in the (n + 1)st digit, performing
the carries which this may induce, and then keeping only the n
first  digits. 

When applied to a number of the form $(.\nu_1 \ldots \nu_n \nu_{n+1} \nu_{n+2} \ldots)$
(ad infinitum!), the effects of either procedure are easily estimated.
In the first case we may say we are dealing with $(.\nu_1 \ldots \nu_{n-1})$
plus a random number of the form $(.0 \ldots 0\,\nu_n \nu_{n+1} \nu_{n+2} \ldots)$,
i.e. random in the interval 0, $1/2^{n-1}$. Comparing with the rounded
off $(.\nu_1 \nu_2 \ldots \nu_{n-1} 1)$, we therefore have a difference random in the
interval $-1/2^n$, $1/2^n$. Hence its mean is 0 and its variance $\tfrac{1}{3} \cdot 2^{-2n}$.
In the second case we are dealing with $(.\nu_1 \ldots \nu_n)$ plus a random
number of the form $(.0 \ldots 00\,\nu_{n+1} \nu_{n+2} \ldots)$, i.e. random in the
interval 0, $1/2^n$. The "rounded-off" value will be $(.\nu_1 \ldots \nu_n)$ increased
by 0 or by $1/2^n$, according to whether the random number
in question lies in the interval 0, $1/2^{n+1}$, or in the interval $1/2^{n+1}$,
$1/2^n$. Hence comparing with the "rounded-off" value, we have
a difference random in the intervals 0, $1/2^{n+1}$, and 0, $-1/2^{n+1}$,
i.e. in the interval $-1/2^{n+1}$, $1/2^{n+1}$. Hence its mean is 0 and its
variance $\tfrac{1}{12} \cdot 2^{-2n}$.

If the number to be rounded-off has the form
$(.\nu_1 \ldots \nu_n \nu_{n+1} \nu_{n+2} \ldots \nu_{n+p})$ (p finite), then these results are somewhat
affected. The order of magnitude of the variance remains the same;
indeed for large p even its relative change is negligible. The mean
difference may deviate from 0 by amounts which are easily estimated
to be of the order $1/2^n \cdot 1/2^p = 1/2^{n+p}$.
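
The means and variances claimed for the two classes of round-off can be checked by brute force for small word lengths. The sketch below assumes non-negative numbers with n + p fractional digits, every digit pattern equally likely (the equiprobability assumption made earlier). The helper names are hypothetical, and exact rational arithmetic is used to avoid floating-point error.

```python
from fractions import Fraction

def round_force_one(v, n, p):
    """First class: drop all digits beyond the nth and replace the nth by 1."""
    kept = (v >> (p + 1)) << 1 | 1      # n digits, the last one forced to 1
    return Fraction(kept, 2 ** n)

def round_add_half(v, n, p):
    """Second class: add one unit in digit n+1, carry, keep the first n digits."""
    kept = (v + (1 << (p - 1))) >> p
    return Fraction(kept, 2 ** n)

def error_stats(rule, n, p):
    """Mean and variance of (rounded - true) over all (n+p)-digit fractions."""
    total = 2 ** (n + p)
    errors = [rule(v, n, p) - Fraction(v, total) for v in range(total)]
    mean = sum(errors) / total
    variance = sum((e - mean) ** 2 for e in errors) / total
    return mean, variance

mean1, var1 = error_stats(round_force_one, 4, 6)
mean2, var2 = error_stats(round_add_half, 4, 6)
print(float(var1 * 4 ** 4), float(var2 * 4 ** 4))   # in units of 2^(-2n)
print(mean1, mean2)
```

Scaled by $2^{2n}$, the two variances come out close to 1/3 and 1/12, and for finite p both means are small non-zero quantities of the order $1/2^{n+p}$, as the paragraph above states.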

In division we have the first situation,
$x/y = (.\omega_1 \ldots \omega_n \omega_{n+1} \omega_{n+2} \ldots)$, i.e. p is infinite. In multiplication we have the
second one, $xy = (.\zeta_1 \ldots \zeta_n \zeta_{n+1} \ldots \zeta_{2n})$, i.e. p = n. Hence for the
division both methods are applicable without modification. In
multiplication a bias of the order of $1/2^{2n}$ may be introduced. We
have  seen  that  it  is  pointless  to  insist  on  removing  biases  of  this 
size.  We  will  therefore  use  the  unmodified  methods  in  this  case, 
too. 

It  should  be  noted  that  the  bias  in  the  case  of  multiplication 
can  be  removed  in  various  ways.  However,  for  the  reasons  set  forth 
above, we shall not complicate the machine by introducing such
corrections. 

Thus we have two standard "round-off" methods, both unbiased
to the extent to which we need this, and with the variances
$1/(3 \cdot 2^{2n})$ and $1/(12 \cdot 2^{2n})$, that is, with the dispersions $(1/\sqrt{3})(1/2^n)$
= 0.58 times the last digit and $(1/(2\sqrt{3}))(1/2^n)$ = 0.29 times the
last digit. The first one requires no carry facilities, the second one
requires them.

Inasmuch as we propose to form the product x'y' in the accumulator,
which has carry facilities, there is no reason why we
should  not  adopt  the  rounding  scheme  described  above  which  has 
the  smaller  dispersion,  i.e.  the  one  which  may  induce  carries.  In 
the  case,  however,  of  division  we  wish  to  avoid  schemes  leading 
to  carries  since  we  expect  to  form  the  quotient  in  the  arithmetic 
register,  which  does  not  permit  of  carry  operations.  The  scheme 
which we accordingly adopt is the one in which $\omega_n$ is replaced
by  1.  This  method  has  the  decided  advantage  that  it  enables  us 
to  write  down  the  approximate  quotient  as  soon  as  we  know  its 
first  (n  —  1)  digits.  It  will  be  seen  in  5.14  and  6.6.4  below  that 
our  procedure  for  forming  the  quotient  of  two  numbers  will  always 
lead  to  a  result  that  is  correctly  rounded  in  accordance  with  the 
decisions  just  made.  We  do  not  consider  as  serious  the  fact  that 
our  rounding  scheme  in  the  case  of  division  has  a  dispersion  twice 
as  large  as  that  in  multiplication  since  division  is  a  far  less  frequent 
operation. 

A  final  remark  should  be  made  in  connection  with  the  possible, 
occasional  need  of  carrying  more  than  n  =  39  digits.  Our  logical 
control is sufficiently flexible to permit treating k (= 2, 3, ...)
words as one number, and thus effecting n = 39k. In this case the
round-off has to be handled differently, cf. Chapter 9, Part II. The
multiplier produces all 78 digits of the basic 39 by 39 digit multiplication:
the first 39 in the Ac, the last 39 in the AR. These must
then be manipulated in an appropriate manner. (For details, cf.
6.6.3 and 9.9-9.10, Part II.) The divider works for 39 digits only:
in forming x/y, it is necessary, even if x and y are available to
39k digits, to use only 39 digits of each, and a 39 digit result will
appear. It seems most convenient to use this result as the first step
of a series of successive approximations. The successive improvements
can then be obtained by various means. One way consists
of using the well known iteration formula (cf. 5.4). For k = 2 one
such step will be needed, for k = 3, 4, two steps, for k = 5, 6,
7, 8 three steps, etc. An alternative procedure is this: Calculate
the remainder, using the approximate, 39 digit, quotient and the
complete, 39k digit, divisor and dividend. Divide this again by
the approximate, 39 digit, divisor, thus obtaining essentially the
next 39 digits of the quotient. Repeat this procedure until the full
39k desired digits of the quotient have been obtained.

5.13. We might mention at this time a complication which
arises when a floating binary point is introduced into the machine.
The operation of addition which usually takes at most 1/10 of a
multiplication time becomes much longer in a machine with
floating binary since one must perform shifts and round-offs as well
as additions. It would seem reasonable in this case to place the
time of an addition as about 1/3 to 1/2 of a multiplication. At this
rate it is clear that the number of additions in a problem is as
important a factor in the total solution time as are the number
of multiplications. (For further details concerning the floating
binary point, cf. 6.6.7.)

5.14.  We  conclude  our  discussion  of  the  arithmetic  unit  with 
a  description  of  our  method  for  handling  the  division  operation. 
To  perform  a  division  we  wish  to  store  the  dividend  in  SR,  the 
partial remainder in Ac and the partial quotient in AR. Before
proceeding further let us consider the so-called restoring and
non-restoring methods of division. In order to be able to make
certain comparisons, we will do this for a general base m = 2,
3, ....

Assume  for  the  moment  that  divisor  and  dividend  are  both 
positive. The ordinary process of division consists of subtracting
from the partial remainder (at the very beginning of the process
this is, of course, the dividend) the divisor, repeating this until
the former becomes smaller than the latter. For any fixed positional
value in the quotient in a well-conducted division this need be
done at most m − 1 times. If, after precisely k = 0, 1, ..., m − 1
repetitions of this step, the partial remainder has indeed become
less than the divisor, then the digit k is put in the quotient (at
the position under consideration), the partial remainder is shifted
one place to the left, and the whole process is repeated for the
next position, etc. Note that the above comparison of sizes is only
needed at k = 0, 1, ..., m − 2, i.e. before step 1 and after steps
1, ..., m − 2. If the value k = m − 1, i.e. the point after step
m − 1, is at all reached in a well-conducted division, then it may
be taken for granted without any test, that the partial remainder
has become smaller than the divisor, and the operations on the
position under consideration can therefore be concluded. (In the
binary system, m = 2, there is thus only one step, and only one
comparison of sizes, before this step.) In this way this scheme,
known as the restoring scheme, requires a maximum of m − 1 comparisons
and utilizes the digits 0, 1, ..., m − 1 in each place in the
quotient. The difficulty of this scheme for machine purposes is that
usually the only economical method for comparing two numbers
as to size is to subtract one from the other. If the partial remainder
$r_n$ were less than the dividend d, one would then have to add d
back into $r_n - d$ in order to restore the remainder. Thus at every
stage an unnecessary operation would be performed. A more symmetrical
scheme is obtained by not restoring. In this method (from
here on we need not assume the positivity of divisor and dividend)
one compares the signs of $r_n$ and d; if they are of the same sign,
the dividend is repeatedly subtracted from the remainder until
the signs become opposite; if they are opposite, the dividend is
repeatedly added to the remainder until the signs again become
like. In this scheme the digits that may occur in a given place
in the quotient are evidently ±1, ±2, ..., ±(m − 1), the positive
digits corresponding to subtractions and the negative ones to
additions of the dividend to the remainder.

Thus  we  have  2(m  —  1)  digits  instead  of  the  usual  m  digits. 
In  the  decimal  system  this  would  mean  18  digits  instead  of  10. 
This  is  a  redundant  notation.  The  standard  form  of  the  quotient 
must  therefore  be  restored  by  subtracting  from  the  aggregate  of 
its  positive  digits  the  aggregate  of  its  negative  digits.  This  requires 
carry  facilities  in  the  place  where  the  quotient  is  stored. 

We  propose  to  store  the  quotient  in  AR,  which  has  no  carry 
facilities.  Hence  we  could  not  use  this  scheme  if  we  were  to 
operate  in  the  decimal  system. 

The same objection applies to any base m for which the digital
representation in question is redundant, i.e. when 2(m − 1) > m.
Now 2(m − 1) > m whenever m > 2, but 2(m − 1) = m for
m = 2. Hence, with the use of a register which we have so far
contemplated,  this  division  scheme  is  certainly  excluded  from  the 
start  unless  the  binary  system  is  used. 

Let us now investigate the situation in the binary system. We
inquire if it is possible to obtain a quasi-quotient by using the
non-restoring scheme and by using the digits 1, 0 instead of 1,
−1. Or rather we have to ask this question: Does this
quasi-quotient bear a simple relationship to the true quotient?

Let  us  momentarily  assume  this  question  can  be  answered 
affirmatively and describe the division procedure. We store the
divisor  initially  in  Ac,  the  dividend  in  SR  and  wish  to  form  the 
quotient  in  AR.  We  now  either  add  or  subtract  the  contents  of 
SR  into  Ac,  according  to  whether  the  signs  in  Ac  and  SR  are 
opposite  or  the  same,  and  insert  correspondingly  a  0  or  1  in  the 
right-hand place of AR. We then shift both Ac and AR one place
left,  with  electronic  shifters  that  are  parts  of  these  two  aggregates. 

At  this  point  we  interrupt  the  discussion  to  note  this:  multipli- 
cation required  an  ability  to  shift  right  in  both  Ac  and  AR  (cf. 
5.8).  We  have  now  found  that  division  similarly  requires  an  ability 
to shift left in both Ac and AR. Hence both organs must be able to
shift  both  ways  electronically.  Since  these  abilities  have  to  be 
present  for  the  implicit  needs  of  multiplication  and  division,  it  is  just 
as  well  to  make  use  of  them  explicitly  in  the  form  of  explicit  orders. 
These  are  the  orders  20, 21  of  Table  1,  and  of  Table  2,  Part  II.  It  will, 
however,  turn  out  to  be  convenient  to  arrange  some  details  in  the 
shifts, when they occur explicitly under the control of those orders,
differently from when they occur implicitly under the control of a
multiplication  or  a  division.  (For  these  things,  cf.  the  discussion  of 
the  shifts  near  the  end  of  5.8  and  in  the  third  remark  below  on  one 
hand,  and  in  the  third  remark  in  7.2,  Part  II,  on  the  other  hand.) 

Let  us  now  resume  the  discussion  of  the  division.  The  process 
described  above  will  have  to  be  repeated  as  many  times  as  the 
number  of  quotient  digits  that  we  consider  appropriate  to  produce 
in  this  way.  This  is  likely  to  be  39  or  40;  we  will  determine  the 
exact  number  further  below. 

In this process we formed digits $\xi_i' = 0$ or 1 for the quotient, when
the digit should actually have been $\xi_i = -1$ or 1, with $\xi_i = 2\xi_i' - 1$.
Thus we have a difference between the true quotient z (based on
the digits $\xi_i$) and the quasi-quotient z' (based on the digits $\xi_i'$), but
at the same time a one-to-one connection. It would be easy to
establish  the  algebraical  expression  for  this  connection  between  z' 
and  z  directly,  but  it  seems  better  to  do  this  as  part  of  a  discussion 
which  clarifies  all  other  questions  connected  with  the  process  of 
division  at  the  same  time. 

We  first  make  some  general  remarks: 

First: Let x be the dividend and y the divisor. We assume, of
course, −1 ≤ x < 1, −1 ≤ y < 1. It will be found that our present
process of division is entirely unaffected by the signs of x and
y, hence no further restrictions on that score are required.

On the other hand, the quotient z = x/y must also fulfil
−1 ≤ z < 1. It seems somewhat simpler, although this is by no
means necessary, to exclude for the purposes of this discussion
z = −1, and to demand |z| < 1. This means in terms of the
dividend x and the divisor y that we exclude x = −y and assume
|x| < |y|.

Second: The division takes place in n steps, which correspond
to the n digits $\xi_1', \ldots, \xi_n'$ of the pseudo-quotient z', n being yet to
be determined (presumably 39 or 40). Assume that the k − 1 first
steps (k = 1, ..., n) have already taken place, having produced
the k − 1 first digits: $\xi_1', \ldots, \xi_{k-1}'$; and that we are now at the
kth step, involving production of the kth digit, $\xi_k'$. Assume
furthermore, that Ac now contains the quantity $r_{k-1}$, the result
of the k − 1 first steps. (This is the (k − 1)st partial remainder.
For k = 1 clearly $r_0 = x$.) We then form $r_k = 2r_{k-1} \mp y$, according
to whether the signs of $r_{k-1}$ and y do or do not agree, i.e.

$$r_k = 2r_{k-1} \boxplus y, \qquad \boxplus \text{ is } \begin{cases} - & \text{if the signs of } r_{k-1} \text{ and } y \text{ do agree} \\ + & \text{if the signs of } r_{k-1} \text{ and } y \text{ do not agree} \end{cases} \tag{4}$$

Let us now see what carries may originate in this procedure.
We can argue as follows: $|r_h| < |y|$ is true for h = 0 ($|r_0| = |x| < |y|$),
and if it is true for h = k − 1, then (4) extends it to
h = k also, since $r_{k-1}$ and $\boxplus y$ have opposite signs. The last point
may be elaborated a little further: because of the opposite signs

$$|r_k| = 2|r_{k-1}| - |y| < 2|y| - |y| = |y|$$

Hence we have always $|r_k| < |y|$, and therefore a fortiori $|r_k| < 1$,
i.e. $-1 < r_k < 1$.

Consequently in equation (4) one summand is necessarily > −2,
< 2, the other is ≥ −1, < 1, and the sum is > −1, < 1. Hence we
may carry out the operations of (4) modulo 2, disregarding any
possibilities of carries beyond the $2^0$ position, and the resulting
$r_k$ will be automatically correct (in the range > −1, < 1).

Third: Note however that the sign of $r_{k-1}$, which plays an
important role in (4) above, is only then correctly determinable
from the sign digit, if the number from which it is derived is ≥ −1,
< 1. (Cf. the discussion in 5.7.) This requirement however is met,
as we saw above, by $r_{k-1}$, but not necessarily by $2r_{k-1}$. Hence the
sign of $r_{k-1}$ (i.e. its sign digit) as required by (4), must be sensed
before $r_{k-1}$ is doubled.

This being understood, the doubling of $r_{k-1}$ may be performed
as a simple left shift, in which the left-most digit (the sign digit)
is allowed to be lost; this corresponds to the disregarding of
carries beyond the $2^0$ position, which we recognized above as being
permissible in (4). (Cf., however, Part II, Table 2, for another sort
of left shift that is desirable in explicit form, i.e. as an order.)

Fourth: Consider now the precise implication of (4) above.
$\xi_k' = 1$ or 0 corresponds to $\boxplus = -$ or $+$, respectively. Hence
(4) may be written

$$r_k = 2r_{k-1} + (1 - 2\xi_k')\,y$$

i.e.

$$2^{-k} r_k = 2^{-(k-1)} r_{k-1} + \left(2^{-k} - 2^{-(k-1)}\xi_k'\right) y$$

Summing over k = 1, ..., n gives

$$2^{-n} r_n = x + \left[(1 - 2^{-n}) - \sum_{k=1}^{n} 2^{-(k-1)}\xi_k'\right] y$$

i.e.

$$x = \left(-1 + \sum_{k=1}^{n} 2^{-(k-1)}\xi_k' + 2^{-n}\right) y + 2^{-n} r_n$$

This makes it clear that $z = -1 + \sum_{k=1}^{n} 2^{-(k-1)}\xi_k' + 2^{-n}$ corresponds
to the true quotient z = x/y and $2^{-n} r_n$, with an absolute value
$< 2^{-n}|y| \le 2^{-n}$, to the remainder. Hence, if we disregard the term
−1 for a moment, $\xi_1', \xi_2', \ldots, \xi_n', 1$ are the n + 1 first digits of
what may be used as a true quotient, the sign digit being part
of this sequence.


Fifth:  If  we  do  not  wish  to  get  involved  in  more  complicated 
round-off  procedures  which  exceed  the  immediate  capacity  of  the 
only available adder Ac, then the above result suggests that we
should put n + 1 = 40, n = 39. The $\xi_1', \ldots, \xi_{39}'$ are then 39 digits
of  the  quotient,  including  the  sign  digit,  but  not  including  the 
right-most  digit. 

The  right-most  digit  is  taken  care  of  by  placing  a  1  into  the 
right-most  stage  of  Ac. 

At  this  point  an  additional  argument  in  favor  of  the  procedure 
that  we  have  adopted  here  becomes  apparent.  The  procedure 
coincides  (without  a  need  for  any  further  corrections)  with  the 
second  round-off  procedure  that  we  discussed  in  5.12. 

There  remains  the  term  —1.  Since  this  applies  to  the  final 
result,  and  no  right  shifts  are  to  follow,  carries  which  might  go 
beyond the $2^0$ position may be disregarded. Hence this amounts
simply to changing the sign digit of the quotient z: replacing 0
or  1  by  1  or  0,  respectively. 
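
The binary non-restoring scheme of the five remarks above can be followed step by step in a small model. This is a sketch only: it uses exact rationals in place of the registers Ac, SR and AR, takes the sign of a zero remainder to be positive (sign digit 0), and the name `nonrestoring_divide` and its return values are hypothetical.

```python
from fractions import Fraction

def nonrestoring_divide(x, y, n):
    """Divide x by y (|x| < |y| < 1 assumed) in n non-restoring steps.

    Returns (z, r, digits): the quotient encoded by the n + 1 digit word,
    the final partial remainder r_n, and the quasi-quotient digits.
    """
    r = Fraction(x)
    digits = []
    for _ in range(n):
        same_sign = (r < 0) == (y < 0)           # sensed before doubling
        digits.append(1 if same_sign else 0)     # 1 = subtract, 0 = add
        r = 2 * r - y if same_sign else 2 * r + y
    word = digits + [1]   # right-most digit is always set to 1
    word[0] ^= 1          # the term -1: complement the sign digit
    # Read the word as a signed fraction: sign digit has weight -1.
    z = -word[0] + sum(Fraction(b, 2 ** i) for i, b in enumerate(word) if i > 0)
    return z, r, digits

x, y = Fraction(179, 512), Fraction(-394, 512)
z, r, digits = nonrestoring_divide(x, y, 9)
print(digits, z, x == z * y + r / 2 ** 9)
```

The final comparison checks the identity $x = zy + 2^{-n} r_n$ derived in the fourth remark; it holds for either sign of x and y, which is the first of the two distinctive features noted below.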

This concludes our discussion of the division scheme. We wish,
however, to re-emphasize two very distinctive features which it
possesses:

First:  This  division  scheme  applies  equally  for  any  combina- 
tions of  signs  of  divisor  and  dividend.  This  is  a  characteristic  of 
the non-restoring division schemes, but it is not the case for any
simple  known  multiplication  scheme.  It  will  be  remembered,  in 
particular,  that  our  multiplication  procedure  of  5.9  had  to  contain 
special  correcting  steps  for  the  cases  where  either  or  both  factors 
are  negative. 

Second:  This  division  scheme  is  practicable  in  the  binary  sys- 
tem only;  it  has  no  analog  for  any  other  base. 

This method of binary division will be illustrated by some
examples in 5.15.

5.15. We give below some illustrative examples of the operations
of binary arithmetic which were discussed in the preceding
sections.

Although it presented no difficulties or ambiguities, it seems
best to begin with an example of addition.


                   Binary notation    Decimal notation (fractional form)

Augend             0.010110011        179/512
Addend             0.011010111        215/512
                   -----------        -------
Sum                0.110001010        394/512
(Carries)            1111 111

In what follows we will not show the carries any more.
We form the negative of a number (cf. 5.7):

                   Binary notation    Decimal notation (fractional form)

                   0.101110100        372/512
Complement:        1.010001011
                             1
                   -----------
                   1.010001100        -1 + 140/512 = -372/512


A subtraction (cf. 5.7):

                            Binary notation    Decimal notation (fractional form)

Subtrahend                  0.011010111        215/512
Minuend                     0.110001010        394/512
Complement of subtrahend    1.100101000
                                      1        -1 + 297/512
                            -----------
Difference                  0.010110011        179/512
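
The three examples above can be checked mechanically. The following is a minimal sketch, assuming a 10-digit word (sign digit plus 9 fractional digits, as in the examples) held in a Python integer; the helper names `parse`, `show`, `add`, `negate`, `subtract` and `value` are illustrative and do not come from the report.

```python
from fractions import Fraction

WIDTH = 10                      # sign digit plus 9 fractional digits, as in the examples
MASK = (1 << WIDTH) - 1


def parse(text):
    """Turn a string such as '0.010110011' into a 10-digit machine word."""
    return int(text.replace(".", ""), 2)


def show(word):
    """Format a machine word with the binary point after the sign digit."""
    bits = format(word, "0{}b".format(WIDTH))
    return bits[0] + "." + bits[1:]


def add(a, b):
    """Add two words, discarding any carry beyond the sign position."""
    return (a + b) & MASK


def negate(word):
    """Complement every digit, then add 1 in the right-most place."""
    return ((word ^ MASK) + 1) & MASK


def subtract(minuend, subtrahend):
    """Subtract by adding the negative of the subtrahend."""
    return add(minuend, negate(subtrahend))


def value(word):
    """Signed value of a word as an exact fraction in [-1, 1)."""
    signed = word - (1 << WIDTH) if word >> (WIDTH - 1) else word
    return Fraction(signed, 1 << (WIDTH - 1))


print(show(add(parse("0.010110011"), parse("0.011010111"))))       # 0.110001010
print(show(negate(parse("0.101110100"))))                          # 1.010001100
print(show(subtract(parse("0.110001010"), parse("0.011010111"))))  # 0.010110011
print(value(parse("1.010001100")))                                 # -93/128, i.e. -372/512
```

Discarding the carry out of the sign position (the `& MASK`) is what lets the same adder serve for both addition and subtraction.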




Some multiplications (cf. 5.8 and 5.9):

                   Binary notation    Decimal notation (fractional form)

Multiplicand       0.101              5/8
Multiplier         0.011              3/8
                   --------
                       0101
                      0101
                     0
                   --------
Product            0.001111           15/64

                                                  Binary notation    Decimal notation (fractional form)

Multiplicand                                      1.101              -3/8
Multiplier                                        1.011              -5/8
                                                  --------
                                                      0101
                                                     0101
                                                    1
                                                  --------
                                                  0.101111
Correction 1†                                     1.001
                                                  --------
                                                  1.110111
Correction 2‡ (complement of the multiplicand)    0.010
                                                      1
                                                  --------
Product                                           0.001111           15/64


A division (cf. 5.14):

                          Binary notation    QD§     Decimal notation (fractional form)

Divisor                   1.011000                   -5/8
Dividend                  0.001111           0       15/64

                          0.011110
                          1.011000
                          --------
                          1.110110           1

                          1.101100
                          0.100111
                                 1
                          --------
                          0.010100           0

                          0.101000
                          1.011000
                          --------
                          0.000000           0

                          0.000000
                          1.011000
                          --------
                          1.011000           1

                          0.110000
                          0.100111
                                 1
                          --------
                          1.011000           1

Quotient (uncorrected)    0.10011
         (corrected)      1.100111                   -1 + 39/64 = -25/64

† For the sign of the multiplicand.  ‡ For the sign of the multiplier.  § Quotient digit.




Note that this deviates by 1/64, i.e. by one unit of the right-most
position, from the correct result -3/8. This is a consequence of
our round-off rule, which forces the right-most digit to be 1 under
all conditions. This occasionally produces results with unfamiliar
and even annoying aspects (e.g. when quotients like 0:y or y:y
are formed), but it is nevertheless unobjectionable and
self-consistent on the basis of our general principles.
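
The division example can be replayed with exact fractions. This is a minimal sketch of the non-restoring scheme of 5.14 under stated assumptions: the function name `nonrestoring_divide` is illustrative, partial remainders are kept as exact fractions rather than machine words (so the loss of the sign digit on a left shift is not modelled), and a zero remainder is treated as positive, as its sign digit is 0.

```python
from fractions import Fraction


def nonrestoring_divide(x, y, n):
    """Form n quotient digits for x / y by the non-restoring scheme.

    Returns (digits, z, r): the digits xi_1 .. xi_n, the quotient z with
    the right-most digit forced to 1 and the -1 term applied, and the
    final partial remainder r_n.
    """
    x, y = Fraction(x), Fraction(y)
    r = x
    digits = []
    for _ in range(n):
        # The quotient digit is 1 when the signs of the partial remainder
        # and the divisor agree, and 0 when they disagree.
        xi = 1 if (r < 0) == (y < 0) else 0
        digits.append(xi)
        r = 2 * r + (1 - 2 * xi) * y
    z = -1 + sum(Fraction(d, 2 ** i) for i, d in enumerate(digits)) + Fraction(1, 2 ** n)
    return digits, z, r


digits, z, r = nonrestoring_divide(Fraction(15, 64), Fraction(-5, 8), 6)
print(digits)   # [0, 1, 0, 0, 1, 1]
print(z)        # -25/64
print(Fraction(15, 64) == z * Fraction(-5, 8) + r / 2 ** 6)   # True
```

The last line checks the identity $x = zy + 2^{-n} r_n$ for this example; the result -25/64 differs from -3/8 by 1/64 because the right-most digit is always set to 1.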

6.    The  control 

6.1. It has already been stated that the computer will contain
an organ, called the control, which can automatically execute the
orders stored in the Selectrons. Actually, for a reason stated in
6.3, the orders for this computer are less than half as long as a
forty binary digit number, and hence the orders are stored in the
Selectron memory in pairs.

Let  us  consider  the  routine  that  the  control  performs  in  direct- 
ing a  computation.  The  control  must  know  the  location  in  the 
Selectron  memory  of  the  pair  of  orders  to  be  executed.  It  must 
direct  the  Selectrons  to  transmit  this  pair  of  orders  to  the  Selectron 
register  and  then  to  itself.  It  must  then  direct  the  execution  of 
the operation specified in the first of the two orders. Among these
orders we can immediately describe two major types: An order
of the first type begins by causing the transfer of the number,
which  is  stored  at  a  specified  memory  location,  from  the  Selectrons 
to  the  Selectron  register.  Next,  it  causes  the  arithmetical  unit  to 
perform  some  arithmetical  operations  on  this  number  (usually  in 
conjunction  with  another  number  which  is  already  in  the  arith- 
metical unit),  and  to  retain  the  resulting  number  in  the  arith- 
metical unit.  The  second  type  order  causes  the  transfer  of  the 
number,  which  is  held  in  the  arithmetical  unit,  into  the  Selectron 
register,  and  from  there  to  a  specified  memory  location  in  the 
Selectrons.  (It  may  also  be  that  this  latter  operation  will  permit 
a  direct  transfer  from  the  arithmetical  unit  into  the  Selectrons.) 
An  additional  type  of  order  consists  of  the  transfer  orders  of  3.5. 
Further  orders  control  the  inputs  and  the  outputs  of  the  machine. 
The  process  described  at  the  beginning  of  this  paragraph  must 
then  be  repeated  with  the  second  order  of  the  order  pair.  This 
entire  routine  is  repeated  until  the  end  of  the  problem. 

6.2.  It  is  clear  from  what  has  just  been  stated  that  the  control 
must  have  a  means  of  switching  to  a  specified  location  in  the 
Selectron memory, for withdrawing both numbers for the computation
and pairs of orders. Since the Selectron memory (as tentatively
planned) will hold $2^{12} = 4,096$ forty-digit words (a word is
either a number or a pair of orders), a twelve-digit binary number
suffices to identify a memory location. Hence a switching mechanism
is required which will, on receiving a twelve-digit binary
number,  select  the  corresponding  memory  location. 

The  type  of  circuit  we  propose  to  use  for  this  purpose  is  known 
as  a  decoding  or  many-one  function  table.  It  has  been  developed 
in  various  forms  independently  by  J.  Rajchman  [Rajchman,  1943] 
and  P.  Crawford  [Crawford,  19??].  It  consists  of  n  flip-flops  which 
register an n-digit binary number. It also has a maximum of $2^n$
output wires. The flip-flops activate a matrix in which the
interconnections between input and output wires are made in such a
way that one and only one of $2^n$ output wires is selected (i.e. has
a  positive  voltage  applied  to  it).  These  interconnections  may  be 
established  by  means  of  resistors  or  by  means  of  non-linear  ele- 
ments (such  as  diodes  or  rectifiers);  all  these  various  methods  are 
under  investigation.  The  Selectron  is  so  designed  that  four  such 
function  table  switches  are  required,  each  with  a  three  digit  entry 
and eight ($2^3$) outputs. Four sets of eight wires each are brought
out  of  the  Selectron  for  switching  purposes,  and  a  particular  loca- 
tion is  selected  by  making  one  wire  positive  with  respect  to  the 
remainder.  Since  all  forty  Selectrons  are  switched  in  parallel,  these 
four  sets  of  wires  may  be  connected  directly  to  the  four  function 
table  outputs. 
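
The behaviour of such a decoding table can be illustrated in a few lines. This is only a sketch: the names `decode` and `selectron_switch` are illustrative, and the assignment of the left-most three address digits to the first table (and so on) is an assumption, since the text does not say which digits feed which table.

```python
def decode(value, n):
    """Many-one function table: n input digits select exactly one of 2**n wires."""
    if not 0 <= value < 2 ** n:
        raise ValueError("input does not fit in n binary digits")
    return [1 if wire == value else 0 for wire in range(2 ** n)]


def selectron_switch(address):
    """Split a 12-digit address into four 3-digit groups, each decoded to 8 wires."""
    groups = [(address >> shift) & 0b111 for shift in (9, 6, 3, 0)]
    return [decode(group, 3) for group in groups]


wires = selectron_switch(0b101_000_111_010)
for group in wires:
    print(group)
```

Each of the four printed lists contains a single 1, at positions 5, 0, 7 and 2 respectively, so 32 wires in all suffice to pick one of the 4,096 locations.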

6.3.  Since  most  computer  operations  involve  at  least  one 
number  located  in  the  Selectron  memory,  it  is  reasonable  to  adopt 
a  code  in  which  twelve  binary  digits  of  every  order  are  assigned 
to  the  specification  of  a  Selectron  location.  In  those  orders  which 
do  not  require  a  number  to  be  taken  out  of  or  into  the  Selectrons 
these  digit  positions  will  not  be  used. 

Though  it  has  not  been  definitely  decided  how  many  operations 
will  be  built  into  the  computer  (i.e.  how  many  different  orders 
the  control  must  be  able  to  understand),  it  will  be  seen  presently 
that there will probably be more than $2^5$ but certainly less than
$2^6$. For this reason it is feasible to assign 6 binary digits for the
order  code.  It  thus  turns  out  that  each  order  must  contain  eighteen 
binary  digits,  the  first  twelve  identifying  a  memory  location  and 
the  remaining  six  specifying  an  operation.  It  can  now  be  explained 
why  orders  are  stored  in  the  memory  in  pairs.  Since  the  same 
memory  organ  is  to  be  used  in  this  computer  for  both  orders  and 
numbers,  it  is  efficient  to  make  the  length  of  each  about  equivalent. 
But  numbers  of  eighteen  binary  digits  would  not  be  sufficiently 
accurate  for  problems  which  this  machine  will  solve.  Rather,  an 
accuracy of at least $10^{-10}$ or $2^{-33}$ is required. Hence it is preferable
to  make  the  numbers  long  enough  to  accommodate  two  orders. 

As  we  pointed  out  in  2.3,  and  used  in  4.2  et  seq.  and  5.7  et 
seq., our numbers will actually have 40 binary digits each. This
allows 20 binary digits for each order, i.e. the 12 digits that specify
a memory location, and 8 more digits specifying the nature of the
operation (instead of the minimum of 6 referred to above). It is
convenient,  as  will  be  seen  in  6.8.2.  and  Chapter  9,  Part  II,  to 
group  these  binary  digits  into  tetrads,  groups  of  4  binary  digits. 
Hence  a  whole  word  consists  of  10  tetrads,  a  half  word  or  order 
of  5  tetrads,  and  of  these  3  specify  a  memory  location  and  the 
remaining  2  specify  the  nature  of  the  operation.  Outside  the 
machine  each  tetrad  can  be  expressed  by  a  base  16  digit.  (The 
base  16  digits  are  best  designated  by  symbols  of  the  10  decimal 
digits  0  to  9,  and  6  additional  symbols,  e.g.  the  letters  a  to  f.  Cf. 
Chapter  9,  Part  II.)  These  16  characters  should  appear  in  the 
typing  for  and  the  printing  from  the  machine.  (For  further  details 
of  these  arrangements,  cf.  loc.  cit.  above.) 
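
The grouping into tetrads can be made concrete as follows. This is a minimal sketch: the function names are illustrative, lower-case letters a to f stand for the six extra symbols as the text suggests, and the location digits are taken to come first within each order, following the description in 6.3.

```python
def tetrads(value, count):
    """Write a non-negative integer as `count` base-16 digits (0-9, a-f)."""
    return format(value, "0{}x".format(count))


def split_word(word):
    """Split a 40-digit word into its two 20-digit orders."""
    return word >> 20, word & 0xFFFFF


def split_order(order):
    """Split a 20-digit order into a 12-digit location and 8 operation digits."""
    return order >> 8, order & 0xFF


word = 0x01a2b7ff3c
left, right = split_word(word)
print(tetrads(word, 10))                            # 01a2b7ff3c
print(tetrads(left, 5), tetrads(right, 5))          # 01a2b 7ff3c
location, operation = split_order(left)
print(tetrads(location, 3), tetrads(operation, 2))  # 01a 2b
```

A whole word thus prints as 10 characters, an order as 5, of which 3 name the memory location and 2 the operation.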

The  specification  of  the  nature  of  the  operation  that  is  involved 
in  an  order  occurs  in  binary  form,  so  that  another  many-one  or 
decoding  function  is  required  to  decode  the  order.  This  function 
table  will  have  six  input  flip-flops  (the  two  remaining  digits  of  the 
order are not needed). Since there will not be 64 different orders,
not  all  64  outputs  need  be  provided.  However,  it  is  perhaps 
worthwhile  to  connect  the  outputs  corresponding  to  unused  order 
possibilities  to  a  checking  circuit  which  will  give  an  indication 
whenever  a  code  word  unintelligible  to  the  control  is  received 
in  the  input  flip-flops. 

The  function  table  just  described  energizes  a  different  output 
wire  for  each  different  code  operation.  As  will  be  shown  later, 
many  of  the  steps  involved  in  executing  different  orders  overlap. 
(For  example,  addition,  multiplication,  division,  and  going  from 
the  Selectrons  to  the  register  all  include  transferring  a  number  from 
the  Selectrons  to  the  Selectron  register.)  For  this  reason  it  is 
perhaps  desirable  to  have  an  additional  set  of  control  wires,  each 
of  which  is  activated  by  any  particular  combination  of  different 
code  digits.  These  may  be  obtained  by  taking  the  output  wires 
of  the  many-one  function  table  and  using  them  to  operate  tubes 
which  will  in  turn  operate  a  one-many  (or  coding)  function  table. 
Such a function table consists of a matrix as before, but in this
case only one of the input wires is activated. This particular table
may  be  referred  to  as  the  recoding  function  table. 

The  twelve  flip-flops  operating  the  four  function  tables  used 
in  selecting  a  Selectron  position,  and  the  six  flip-flops  operating 
the function table used for decoding the order, are referred to as
the  Function  Table  Register,  FR. 

6.4.  Let  us  consider  next  the  process  of  transferring  a  pair 
of  orders  from  the  Selectrons  to  the  control.  These  orders  first  go 
into  SR.  The  order  which  is  to  be  used  next  may  be  transferred 
directly  into  FR.  The  second  order  of  the  pair  must  be  removed 
from  SR  (since  SR  may  be  used  when  the  first  order  is  executed), 
but cannot as yet be placed in FR. Hence a temporary storage
is provided for it. The storage means is called the Control Register,
CR, and consists of 20 (or possibly 18) flip-flops, capable of
receiving a number from SR and transmitting a number to FR.

As  already  stated  (6.1),  the  control  must  know  the  location  of 
the  pair  of  orders  it  is  to  get  from  the  Selectron  memory.  Normally 
this  location  will  be  the  one  following  the  location  of  the  two 
orders  just  executed.  That  is,  until  it  receives  an  order  to  do 
otherwise,  the  control  will  take  its  orders  from  the  Selectrons  in 
sequence.  Hence  the  order  location  may  be  remembered  in  a 
twelve stage binary counter (one capable of counting $2^{12}$) to which
one  unit  is  added  whenever  a  pair  of  orders  is  executed.  This 
counter  is  called  the  Control  Counter,  CC. 

The  details  of  the  process  of  obtaining  a  pair  of  orders  from 
the  Selectron  are  thus  as  follows:  The  contents  of  CC  are  copied 
into  FR,  the  proper  Selectron  location  is  selected,  and  the  contents 
of  the  Selectrons  are  transferred  to  SR.  FR  is  then  cleared,  and 
the  contents  of  SR  are  transferred  to  it  and  CR.  CC  is  advanced 
by  one  unit  so  the  control  will  be  prepared  to  select  the  next  pair 
of  orders  from  the  memory.  (There  is,  however,  an  exception  from 
this last rule for the so-called transfer orders, cf. 3.5. This may
feed  CC  in  a  different  manner,  cf.  the  next  paragraph  below.)  First 
the  order  in  FR  is  executed  and  then  the  order  in  CR  is  transferred 
to  FR  and  executed.  It  should  be  noted  that  all  these  operations 
are  directed  by  the  control  itself — not  only  the  operations  specified 
in  the  control  words  sent  to  FR,  but  also  the  automatic  operations 
required  to  get  the  correct  orders  there. 

Since  the  method  by  means  of  which  the  control  takes  order 
pairs  in  sequence  from  the  memory  has  been  described,  it  only 
remains  to  consider  how  the  control  shifts  itself  from  one  sequence 
of  control  orders  to  another  in  accordance  with  the  operations 
described in 3.5. The execution of these operations is relatively
simple.  An  order  calling  for  one  of  these  operations  contains  the 
twelve  digit  specification  of  the  position  to  which  the  control  is 
to  be  switched,  and  these  digits  will  appear  in  the  left-hand  twelve 
flip-flops  of  FR.  All  that  is  required  to  shift  the  control  is  to  transfer 
the  contents  of  these  flip-flops  to  CC.  When  the  control  goes  to 
the  Selectrons  for  the  next  pair  of  orders  it  will  then  go  to  the 
location  specified  by  the  number  so  transferred.  In  the  case  of  the 
unconditional  transfer,  the  transfer  is  made  automatically;  in  the 
case  of  the  conditional  transfer  it  is  made  only  if  the  sign  counter 
of  the  Accumulator  registers  zero. 
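
The routine of 6.4 can be sketched as a small simulation. Everything specific here is an assumption made for illustration: the operation codes `LOAD`, `ADD`, `STORE`, `JUMP`, `CJUMP` and `HALT` are hypothetical (the report fixes no numbering), the first order of a pair is taken to be the left half of the word, numbers are plain Python integers rather than 40-digit words, and a transfer is assumed to abandon the rest of the current pair.

```python
# Hypothetical operation codes; the report does not fix any numbering.
LOAD, ADD, STORE, JUMP, CJUMP, HALT = range(6)


def order(op, address=0):
    """Pack one 20-digit order: 12 location digits, then 8 operation digits."""
    return (address << 8) | op


def pair(first, second):
    """Pack two orders into one 40-digit word (first order on the left: assumed)."""
    return (first << 20) | second


def run(memory, limit=1000):
    """Follow the fetch routine of 6.4 until a HALT order; return Ac."""
    cc = 0    # Control Counter
    ac = 0    # Accumulator (a plain Python integer in this sketch)
    for _ in range(limit):
        fr = cc                           # contents of CC copied into FR
        sr = memory[fr]                   # selected word transferred to SR
        fr, cr = sr >> 20, sr & 0xFFFFF   # first order to FR, second to CR
        cc = (cc + 1) % 4096              # CC advanced by one unit
        for current in (fr, cr):          # the order in CR moves to FR in turn
            address, op = current >> 8, current & 0xFF
            if op == LOAD:
                ac = memory[address]
            elif op == ADD:
                ac += memory[address]
            elif op == STORE:
                memory[address] = ac
            elif op == JUMP or (op == CJUMP and ac >= 0):
                cc = address              # left-hand twelve digits of FR go to CC
                break                     # assumed: rest of the pair is abandoned
            elif op == HALT:
                return ac
    raise RuntimeError("no HALT order reached")


memory = [0] * 4096
memory[0] = pair(order(LOAD, 10), order(ADD, 11))
memory[1] = pair(order(JUMP, 3), order(HALT))
memory[3] = pair(order(STORE, 12), order(HALT))
memory[10], memory[11] = 5, 7
print(run(memory), memory[12])   # 12 12
```

The conditional transfer is taken when the accumulator is non-negative, matching the condition that its sign digit registers zero.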

6.5.  In  this  report  we  will  discuss  only  the  general  method 
by  means  of  which  the  control  will  execute  specific  orders,  leaving 
the  details  until  later.  It  has  already  been  explained  (5.5)  that  when 
a  circuit  is  to  be  designed  to  accomplish  a  particular  elementary 
operation (such as addition), a choice must be made between a
static type and a dynamic type circuit. When the design of the
control is considered, this same choice arises. The function of the
control  is  to  direct  a  sequence  of  operations  which  take  place  in 
the  various  circuits  of  the  computer  (including  the  circuits  of  the 
control  itself).  Consider  what  is  involved  in  directing  an  operation. 
The  control  must  signal  for  the  operation  to  begin,  it  must  supply 
whatever  signals  are  required  to  specify  that  particular  operation, 
and  it  must  in  some  way  know  when  the  operation  has  been 
completed  so  that  it  may  start  the  succeeding  operation.  Hence 
the  control  circuits  must  be  capable  of  timing  the  operations.  It 
should be noted that timing is required whether the circuit performing
the operation is static or dynamic. In the case of a static
type  circuit  the  control  must  supply  static  control  signals  for  a 
period  of  time  sufficient  to  allow  the  output  voltages  to  reach  the 
steady-state  condition.  In  the  case  of  a  dynamic  type  circuit  the 
control  must  send  various  pulses  at  proper  intervals  to  this  circuit. 

If  all  circuits  of  a  computer  are  static  in  character,  the  control 
timing  circuits  may  likewise  be  static,  and  no  pulses  are  needed 
in  the  system.  However,  though  some  of  the  circuits  of  the  com- 
puter we  are  planning  will  be  static,  they  will  probably  not  all 
be  so,  and  hence  pulses  as  well  as  static  signals  must  be  supplied 
by  the  control  to  the  rest  of  the  computer.  There  are  many  advan- 
tages in  deriving  these  pulses  from  a  central  source,  called  the 
clock.  The  timing  may  then  be  done  either  by  means  of  counters 
counting  clock  pulses  or  by  means  of  electrical  delay  lines  (an  RC 
circuit  is  here  regarded  as  a  simple  delay  line).  Since  the  timing 
of the entire computer is governed by a single pulse source, the
computer  circuits  will  be  said  to  operate  as  a  synchronized  system. 

The  clock  plays  an  important  role  both  in  detecting  and  in 
localizing  the  errors  made  by  the  computer.  One  method  of  check- 
ing which  is  under  consideration  is  that  of  having  two  identical 
computers  which  operate  in  parallel  and  automatically  compare 
each  other's  results.  Both  machines  would  be  controlled  by  the 
same clock, so they would operate in absolute synchronism. It is
not necessary to compare every flip-flop of one machine with the
corresponding  flip-flop  of  the  other.  Since  all  numbers  and  control 
words  pass  through  either  the  Selectron  register  or  the  accumu- 
lator soon  before  or  soon  after  they  are  used,  it  suffices  to  check 
the  flip-flops  of  the  Selectron  register  and  the  flip-flops  of  the 
accumulator  which  hold  the  number  registered  there;  in  fact,  it 
seems  possible  to  check  the  accumulator  only  (cf.  the  end  of  6.6.2). 
The  checking  circuit  would  stop  the  clock  whenever  a  difference 
appeared,  or  stop  the  machine  in  a  more  direct  manner  if  an 
asynchronous system is used. Every flip-flop of each computer will
be located at a convenient place. In fact, all neons will be located
on one panel, the corresponding neons of the two machines being
placed in parallel rows so that one can tell at a glance (after the
machine  has  been  stopped)  where  the  discrepancies  are. 

The  merits  of  any  checking  system  must  be  weighed  against 
its cost. Building two machines may appear to be expensive, but
since  most  of  the  cost  of  a  scientific  computer  lies  in  development 
rather  than  production,  this  consideration  is  not  so  important  as 
it  might  seem.  Experience  may  show  that  for  most  problems  the 
two  machines  need  not  be  operated  in  parallel.  Indeed,  in  most 
cases  purely  mathematical,  external  checks  are  possible:  Smooth- 
ness of  the  results,  behavior  of  differences  of  various  types,  validity 
of  suitable  identities,  redundant  calculations,  etc.  All  of  these 
methods  are  usually  adequate  to  disclose  the  presence  or  absence 
of error in toto; their drawback is only that they may not allow
the  detailed  diagnosing  and  locating  of  errors  at  all  or  with  ease. 
When  a  problem  is  run  for  the  first  time,  so  that  it  requires  special 
care,  or  when  an  error  is  known  to  be  present,  and  has  to  be 
located — only  then  will  it  be  necessary  as  a  rule,  to  use  both 
machines  in  parallel.  Thus  they  can  be  used  as  separate  machines 
most of the time. The essential feature of such a method of checking
lies in the fact that it checks the computation at every point
(and hence detects transient errors as well as steady-state ones)
and stops the machine when the error occurs so that the process
of localizing the fault is greatly simplified. These advantages are
only  partially  gained  by  duplicating  the  arithmetic  part  of  the 
computer,  or  by  following  one  operation  with  the  complement 
operation  (multiplication  by  division,  etc.),  since  this  fails  to  check 
either  the  memory  or  the  control  (which  is  the  most  complicated, 
though  not  the  largest,  part  of  the  machine). 

The  method  of  localizing  errors,  either  with  or  without  a  dupli- 
cate machine,  needs  further  discussion.  It  is  planned  to  design  all 
the  circuits  (including  those  of  the  control)  of  the  computer  so 
that  if  the  clock  is  stopped  between  pulses  the  computer  will 
retain  all  its  information  in  flip-flops  so  that  the  computation  may 
proceed  unaltered  when  the  clock  is  started  again.  This  principle 
has already demonstrated its usefulness in the ENIAC. This makes
it  possible  for  the  machine  to  compute  with  the  clock  operating 
at  any  speed  below  a  certain  maximum,  as  long  as  the  clock  gives 
out  pulses  of  constant  shape  regardless  of  the  spacing  between 
pulses.  In  particular,  the  spacing  between  pulses  may  be  made 
indefinitely  large.  The  clock  will  be  provided  with  a  mode  of 
operation  in  which  it  will  emit  a  single  pulse  whenever  instructed 
to  do  so  by  the  operator.  By  means  of  this,  the  operator  can  cause 
the  machine  to  go  through  an  operation  step  by  step,  checking 
the results by means of the indicating-lamps connected to the
flip-flops. It will be noted that this design principle does not
exclude the use of delay lines to obtain delays as long as these
are only used to time the constituent operations of a single step,
and have no part in determining the machine's operating repetition
rate. Timing coincidences by means of delay lines is excluded
since  this  requires  a  constant  pulse  rate. 

6.6.  The  orders  which  the  control  understands  may  be  divided 
into  two  groups:  Those  that  specify  operations  which  are  per- 
formed within  the  computer  and  those  that  specify  operations 
involved  in  getting  data  into  and  out  of  the  computer.  At  the 
present  time  the  internal  operations  are  more  completely  planned 
than  the  input  and  output  operations,  and  hence  they  will  be 
discussed  more  in  detail  than  the  latter  (which  are  treated  briefly 
in  6.8).  The  internal  operations  which  have  been  tentatively 
adopted  are  listed  in  Table  1.  It  has  already  been  pointed  out  that 
not  all  of  these  operations  are  logically  basic,  but  that  many  can 
be programmed by means of others. In the case of some of these
operations  the  reasons  for  building  them  into  the  control  have 
already  been  given.  In  this  section  we  will  give  reasons  for  building 
the  other  operations  into  the  control  and  will  explain  in  the  case 
of  each  operation  what  the  control  must  do  in  order  to  exe- 
cute it. 

In  order  to  have  the  precise  mathematical  meaning  of  the 
symbols  which  are  introduced  in  what  follows  clearly  in  mind, 
the  reader  should  consult  the  table  at  the  end  of  the  report  for 
each  new  symbol,  in  addition  to  the  explanations  given  in  the  text. 

6.6.1. Throughout what follows S(x) will denote the memory
location No. x in the Selectron. Accordingly the x which appears
in S(x) is a 12-digit binary, in the sense of 6.2. The eight addition
operations [S(x) → Ac+, S(x) → Ac−, S(x) → Ah+, S(x) → Ah−,
S(x) → Ac+M, S(x) → Ac−M, S(x) → Ah+M, S(x) → Ah−M]
involve the following possible four steps:

First: Clear SR and transfer into it the number at S(x).

Second:  Clear  Ac  if  the  order  contains  the  symbol  c;  do  not 
clear Ac if the order contains the symbol h.

Third: Add the number in SR or its negative (i.e. in our present
system its complement with respect to $2^1$) into Ac. If the order does
not contain the symbol M, use the number in SR or its negative
according to whether the order contains the symbol + or −. If the
order contains the symbol M, use the number in SR or its negative
according to whether the sign of the number in SR and the symbol
+ or − in the order do or do not agree.

Fourth: Perform a complete carry.

Building the last four addition operations (those containing the
symbol M) into the control is fairly simple: It calls only for one
extra comparison (of the sign in SR and the + or − in the order,
cf. the third step above), and it requires, therefore, only a few
tubes more than required for the first four addition operations
(those not containing the symbol M). These facts would seem of
themselves to justify adding the operations in question: plus and
minus the absolute value. But it should be noted that these
operations can be programmed out of the other operations of Table 1
with correspondingly few orders (three for absolute value and five
for minus absolute value), so that some further justification for
building them in is required. The absolute value order is frequently
used in connection with the orders L and R (see 6.6.7), while the
minus absolute value order makes the detection of a zero very simple
by merely detecting the sign of −|N|.
(If −|N| ≥ 0, then N = 0.)
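
The second and third steps above amount to a small decision rule, sketched here under simplifying assumptions: numbers are ordinary Python integers (overflow and the complete carry are not modelled), and the function name `addition_order` with its three flags is illustrative rather than taken from the report.

```python
def addition_order(ac, sr, clear, minus, magnitude):
    """Return the new contents of Ac after one of the eight addition orders.

    clear     -- True for the 'c' orders (Ac cleared first), False for 'h'
    minus     -- True if the order carries the symbol '-', False for '+'
    magnitude -- True if the order carries the symbol 'M'
    """
    if clear:
        ac = 0
    if magnitude:
        # Use SR itself when its sign agrees with the order's symbol.
        use_sr = (sr < 0) == minus
    else:
        use_sr = not minus
    return ac + (sr if use_sr else -sr)


print(addition_order(10, -3, clear=True, minus=False, magnitude=True))   # 3, i.e. |N|
print(addition_order(10, -3, clear=True, minus=True, magnitude=True))    # -3, i.e. -|N|
print(addition_order(10, -3, clear=False, minus=True, magnitude=False))  # 13
```

With `clear=True, minus=True, magnitude=True` the result is non-negative only when the number is zero, which is the zero test mentioned in the text.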

6.6.2. The operation of S(x) → R involves the following two
steps:

First: Clear SR, and transfer S(x) to it.

Second:  Clear  AR  and  add  the  number  in  the  Selectron  register 
into it. The operation of R → Ac merits more detailed discussion,
since  there  are  alternative  ways  of  removing  numbers  from  AR. 
Such  numbers  could  be  taken  directly  to  the  Selectrons  as  well 
as  into  Ac,  and  they  could  be  transferred  to  Ac  in  parallel,  in 
sequence,  or  in  sequence  parallel.  It  should  be  recalled  that  while 
most  of  the  numbers  that  go  into  AR  have  come  from  the  Selec- 
trons and  thus  need  not  be  returned  to  them,  the  result  of  a 
division  and  the  right-hand  39  digits  of  a  product  appear  in  AR. 
Hence while an operation for withdrawing a number from AR is
required, it is relatively infrequent and therefore need not be
particularly fast. We are therefore considering the possibility of
transferring at least partially in sequence and of using the shifting
properties  of  Ac  and  of  AR  for  this.  Transferring  the  number  to 
the  Selectron  via  the  accumulator  is  also  desirable  if  the  dual 
machine  method  of  checking  is  employed,  for  it  means  that  even 
if  numbers  are  only  checked  in  their  transit  through  the  accumu- 
lator, nevertheless  every  number  going  into  the  Selectron  is 
checked  before  being  placed  there. 

6.6.3. The operation S(x) × R → Ac involves the following six
steps:

First: Clear SR and transfer S(x) (the multiplicand) into it.

Second: Thirty-nine steps, each of which consists of the two
following parts: (a) Add (or rather shift) the sign digit of SR into
the partial product in Ac, or add all but the sign digit of SR into
the partial product in Ac, depending upon whether the right-most
digit in AR is 0 or 1, and effect the appropriate carries. (b) Shift
Ac and AR to the right, fill the sign digit of Ac with a 0 and the
digit of AR immediately right of the sign digit (positional value
$2^{-1}$) with the previously right-most digit of Ac. (There are ways
to save time by merging these two operations when the right-most
digit in AR is 0, but we will not discuss them here more fully.)

Third: If the sign digit in SR is 1 (i.e. −), then inject a carry
into the right-most stage of Ac and place a 1 into the sign digit
of Ac.

Fourth: If the original sign digit of AR is 1 (i.e. −), then subtract
the contents of SR from Ac.

Fifth:  If  a  partial  carry  system  was  employed  in  the  main 
process,  then  a  complete  carry  is  necessary  at  the  end. 

Sixth:  The  appropriate  round-off  must  be  effected.  (Cf.  Chapter 
9,  Part  II,  for  details,  where  it  is  also  explained  how  the  sign  digit 
of  the  Arithmetic  register  is  treated  as  part  of  the  round-off 
process.) 
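
The first four steps can be traced on the short words of the examples in 5.15. This is a sketch under stated assumptions: Ac and AR together are modelled as one exact fraction, so the partial-carry question (fifth step) and the round-off (sixth step) do not arise; words are given as strings such as '1.101'; and the names `wrap` and `multiply` are illustrative.

```python
from fractions import Fraction


def wrap(v):
    """Discard carries beyond the sign position: reduce modulo 2 into [-1, 1)."""
    return (v + 1) % 2 - 1


def multiply(multiplicand, multiplier):
    """Multiply two words given as strings such as '1.101' (sign digit first).

    Returns (uncorrected, product) as exact fractions.
    """
    n = len(multiplier) - 2                       # number of non-sign digits
    eta0 = int(multiplicand[0])                   # sign digit of SR
    y_rest = Fraction(int(multiplicand[2:], 2), 2 ** n)   # all but the sign digit
    y = y_rest - eta0                             # signed value of the multiplicand
    xi0 = int(multiplier[0])                      # original sign digit of AR
    p = Fraction(0)
    for digit in reversed(multiplier[2:]):        # right-most multiplier digit first
        p += y_rest if digit == "1" else eta0     # part (a)
        p /= 2                                    # part (b): shift right
    uncorrected = p
    if eta0:                                      # Third: multiplicand negative
        p += 1 + Fraction(1, 2 ** n)
    if xi0:                                       # Fourth: multiplier negative
        p -= y
    return uncorrected, wrap(p)


print(*multiply("0.101", "0.011"))   # 15/64 15/64
print(*multiply("1.101", "1.011"))   # 47/64 15/64
```

For the second pair the uncorrected value 47/64 is 0.101111, and the two corrections bring it to 15/64, the product of −3/8 and −5/8.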

It will be noted that since any number held in Ac at the beginning
of the process is gradually shifted into AR, it is impossible
to  accumulate  sums  of  products  in  Ac  without  storing  the  various 
products  temporarily  in  the  Selectrons.  While  this  is  undoubtedly 
a  disadvantage,  it  cannot  be  eliminated  without  constructing  an 
extra  register,  and  this  does  not  at  this  moment  seem  worthwhile. 

On  the  other  hand,  saving  the  right-hand  39  digits  of  the  answer 
is accomplished with very little extra equipment, since it means
connecting the $2^{-39}$ stage of Ac to the $2^{-1}$ stage of AR during the
shift  operation.  The  advantage  of  saving  these  digits  is  that  it 
simplifies  the  handling  of  numbers  of  any  number  of  digits  in  the 
computer  (cf.  the  last  part  of  5.12).  Any  number  of  39k  binary 
digits  (where  k  is  an  integer)  and  sign  can  be  divided  into  k  parts, 
each  part  being  placed  in  a  separate  Selectron  position.  Addition 
and  subtraction  of  such  numbers  niav  be  programmed  out  of  a 
series  of  additions  or  subtractions  of  the  39-digit  parts,  the  carry- 
over being  programmed  by  means  of  Cc  S(.v)  and  Cc'  — »  S(.\) 
operations.  (If  the  2"  stage  of  Ac  registers  negative  after  the  addi- 
tion of  two  .39-digit  parts,  a  carry-over  has  taken  place  and  hence 
2~^^  must  be  added  to  the  sum  of  the  next  parts.)  A  similar  proce- 
dure may  be  followed  in  multiplication  if  all  78  digits  of  the 
product  of  the  two  39-digit  parts  are  kept,  as  is  planned.  (For  the 
details,  cf.  Chapter  9,  Part  II.)  Since  it  would  greatly  complicate 
the  computer  to  make  provision  for  holding  and  using  a  78  digit 
dividend,  it  is  planned  to  program  39k  digit  division  in  one  of  the 
wavs  described  at  the  end  of  5.12. 
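
The carry-over rule for 39k-digit addition can be illustrated with a short sketch. It assumes non-negative numbers stored as lists of 39-digit parts with the least significant part first; the function name `add_parts` and this storage order are choices made here, not something the report prescribes. A sum that spills into the sign (2^0) stage is treated as a carry into the next part.

```python
DIGITS = 39
MASK = (1 << DIGITS) - 1


def add_parts(a_parts, b_parts):
    """Add two non-negative numbers held as equal-length lists of
    39-digit parts, least significant part first.

    Returns (sum_parts, final_carry).
    """
    carry, result = 0, []
    for a, b in zip(a_parts, b_parts):
        total = a + b + carry
        carry = total >> DIGITS      # 1 if the sum reached the sign stage
        result.append(total & MASK)  # keep the 39 digits of this part
    return result, carry
```

In the machine the test on `carry` would be the conditional transfer Cc → S(x) or Cc' → S(x); here it is an ordinary shift and mask.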

6.6.4. The operation of division A ÷ S(x) → R involves the
following four steps:

First: Clear SR and transfer S(x) (the divisor) into it.

Second:  Clear  AR. 

Third: Thirty-nine steps, each of which consists of the following
three parts: (a) Sense the signs of the contents of Ac (the partial
remainder) and of SR, and sense whether they agree or not. (b)
Shift Ac and AR left. In this process the previous sign digit of
Ac is lost. Fill the right-most digit of Ac (after the shift) with a
0, and the right-most digit of AR (before the shift) with 0 or 1,
depending on whether there was disagreement or agreement in
(a). (c) Add or subtract the contents of SR into Ac, depending on
the same alternative as above.

Fourth:  Fill  the  right-most  digit  of  AR  with  a  1,  and  change 
its  sign  digit. 
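
The four steps describe a non-restoring division, and a sketch may make the role of the final correction clearer. The code below is an illustration under assumptions made here: numbers are signed Python integers scaled by 2^39 (so the fraction is the integer divided by 2^39), the magnitude of the dividend is less than that of the divisor, and unbounded integers stand in for the 40-stage registers, so the loss of the old sign digit during the shift is not modeled. The name `divide` is hypothetical.

```python
DIGITS = 39


def divide(dividend, divisor):
    """Non-restoring division of two fractions scaled by 2**39.

    Requires abs(dividend) < abs(divisor).
    Returns (quotient, remainder): the quotient scaled by 2**39, and
    the final partial remainder left in Ac.
    """
    ac, sr, ar = dividend, divisor, 0
    for _ in range(DIGITS):
        agree = (ac < 0) == (sr < 0)          # (a) compare the signs
        ar = (ar << 1) | (1 if agree else 0)  # (b) record the digit
        # (b)+(c): double the partial remainder, then subtract on
        # agreement or add on disagreement.
        ac = 2 * ac - sr if agree else 2 * ac + sr
    ar = (ar << 1) | 1                        # Fourth: right-most digit 1
    ar ^= 1 << DIGITS                         # Fourth: change the sign digit
    # Read the 40 digits of AR as a signed number (sign digit weight -1).
    quotient = ar - (1 << (DIGITS + 1)) if ar >> DIGITS else ar
    return quotient, ac
```

With these conventions `dividend * 2**39 == divisor * quotient + remainder` holds exactly and the remainder never exceeds the divisor in magnitude, so the quotient is within one unit in the last place of the true value.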

For the purpose of timing the 39 steps involved in division a
six-stage counter (capable of counting to 2^6 = 64) will be built
into the control. This same counter will also be used for timing
the 39 steps of multiplication, and possibly for controlling Ac when
a number is being transferred between it and a tape in either
direction (see 6.8).

6.6.5. The three substitution operations [At → S(x), Ap → S(x),
and Ap' → S(x)] involve transferring all or part of the number held
in Ac into the Selectrons. This will be done by means of gate tubes
connected to the registering flip-flops of Ac. Forty such tubes are
needed for the total substitution, At → S(x). The partial substitutions
Ap → S(x) and Ap' → S(x) require that the left-hand twelve
digits of the number held in Ac be substituted in the proper places
in the left-hand and right-hand orders, respectively. This may be
done by means of extra gate tubes, or by shifting the number in
Ac and using the gate tubes required for At → S(x). (This scheme
needs some additional elaboration when the order directing and
the order suffering the substitution are the two successive halves
of the same word; i.e. when the latter is already in FR at the time
when the former becomes operative in CR, so that the substitution
effected in the Selectrons comes too late to alter the order which
has already reached CR, to become operative at the next step in
FR. There are various ways to take care of this complication, either
by some additional equipment or by appropriate prescriptions in
coding. We will not discuss them here in more detail, since the
decisions in this respect are still open.)

The importance of the partial substitution operations can
hardly be overestimated. It has already been pointed out (3.3) that
they allow the computer to perform operations it could not otherwise
conveniently perform, such as making use of a function table
stored in the Selectron memory. Furthermore, these operations
remove a very sizeable burden from the person coding problems,
for they make possible the coding of classes of problems in contrast
to coding each individual problem separately. Because Ap → S(x)
and Ap' → S(x) are available, any program sequence may be stated
in general form (that is, without Selectron location designations
for the numbers being operated on) and the Selectron locations
of the numbers to be operated on substituted whenever that sequence
is used. As an example, consider a general code for nth
order integration of m total differential equations for p steps of
independent variable t, formulated in advance. Whenever a problem
requiring this rule is coded for the computer, the general
integration sequence can be inserted into the statement of the
problem along with coded instructions for telling the sequence
where it will be located in the memory [so that the proper S(x)
designations will be inserted into such orders as Cu → S(x), etc.].
Whenever this sequence is to be used by the computer it will
automatically substitute the correct values of m, n, p and Δt, as
well as the locations of the boundary conditions and the descriptions
of the differential equations, into the general sequence. (For
the details of this particular procedure, cf. Chapter 13, Part II.)
A library of such general sequences will be built up, and facilities
provided for convenient insertion of any of these into the coded
statement of a problem (cf. 6.8.4). When such a scheme is used,
only the distinctive features of a problem need be coded.
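
The effect of the two partial substitutions on a stored word can be sketched as follows. The sketch assumes a 40-digit word holding two 20-digit orders, each with its location designation in its left-hand 12 digits, as Table 1 describes; words are modeled as Python integers with the left-most digit most significant, and the function name `substitute` is chosen here for illustration.

```python
WORD, ORDER, ADDRESS = 40, 20, 12
FIELD = (1 << ADDRESS) - 1


def substitute(word, ac, right_hand=False):
    """Replace the left-hand 12 digits of one order in `word` by the
    left-hand 12 digits of `ac`.

    right_hand=False models Ap -> S(x); right_hand=True models Ap' -> S(x).
    """
    new_address = (ac >> (WORD - ADDRESS)) & FIELD
    # Position of the 12-digit field within the 40-digit word.
    shift = (ORDER - ADDRESS) if right_hand else (WORD - ADDRESS)
    return (word & ~(FIELD << shift)) | (new_address << shift)
```

A general sequence would be stored with placeholder location fields; before it is used, a few calls of this kind fill in the Selectron locations of the actual operands.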

6.6.6. The manner in which the control shift operations
[Cu → S(x), Cu' → S(x), Cc → S(x), and Cc' → S(x)] are realized has
been discussed in 6.4 and needs no further comment.

6.6.7. One basic question which must be decided before a
computer is built is whether the machine is to have a so-called
floating binary (or decimal) point. While a floating binary point
is undoubtedly very convenient in coding problems, building it
into the computer adds greatly to its complexity and hence a
choice in this matter should receive very careful attention. However,
it should first be noted that the alternatives ordinarily considered
(building a machine with a floating binary point vs. doing
all computation with a fixed binary point) are not exhaustive and
hence that the arguments generally advanced for the floating
binary point are only of limited validity. Such arguments overlook
the fact that the choice with respect to any particular operation
(except for certain basic ones) is not between building it into the
computer and not using it at all, but rather between building it
into the computer and programming it out of operations built into
the computer. (One short reference to the floating binary point
was made in 5.13.)

Building a floating binary point into the computer will not only
complicate the control but will also increase the length of a number
and hence increase the size of the memory and the arithmetic
unit. Every number is effectively increased in size, even though
the floating binary point is not needed in many instances. Furthermore,
there is considerable redundancy in a floating binary point
type of notation, for each number carries with it a scale factor,
while generally speaking a single scale factor will suffice for a
possibly extensive set of numbers. By means of the operations
already described in the report a floating binary point can be
programmed. While additional memory capacity is needed for this,
it is probably less than that required by a built-in floating binary
point since a different scale factor does not need to be remembered
for each number.

To program a floating binary point involves detecting where
the first zero occurs in a number in Ac. Since Ac has shifting
facilities this can best be done by means of them. In terms of the
operations previously described this would require taking the given
number out of Ac and performing a suitable arithmetical operation
on it: for a (multiple) right shift a multiplication, for a (multiple)
left shift either one division, or as many doublings (i.e. additions)
as the shift has stages. However, these operations are inconvenient
and time-consuming, so we propose to introduce two operations
(L and R) in order that this (i.e. the single left and right shift)
can be accomplished directly. These operations make use of facilities
already present in Ac and hence add very little equipment
to the computer. It should be noted that in many instances a single
use of L and possibly of R will suffice in programming a floating
binary point. For if the two factors in a multiplication have no
superfluous zeros, the product will have at most one superfluous
zero (if 1/2 ≤ X < 1 and 1/2 ≤ Y < 1, then 1/4 ≤ XY < 1). This is
similarly true in division (if 1/4 ≤ X < 1/2 and 1/2 ≤ Y < 1, then
1/4 < X/Y < 1). In addition and subtraction any numbers growing
out of range can be treated similarly. Numbers which decrease
in these cases, i.e. develop a sequence of zeros at the beginning,
are really (mathematically) losing precision. Hence it is perfectly
proper to omit formal readjustments in this event. (Indeed, such
a true loss of precision cannot be obviated by any formal procedure,
but, if at all, only by a different mathematical formulation
of the problem.)
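
The claim that one L (and possibly one R) suffices can be checked with a small sketch of a programmed floating binary point. It is an illustration only: values are positive exact fractions with 1/2 ≤ x < 1, each carried with an integer scale exponent, and the function names `scaled_multiply` and `scaled_divide` are invented here. `L` and `R` stand for the proposed doubling and halving orders.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def L(x):
    """Single left shift: multiply by 2."""
    return 2 * x


def R(x):
    """Single right shift: divide by 2."""
    return x / 2


def scaled_multiply(x, ex, y, ey):
    """Product of x*2**ex and y*2**ey, with 1/2 <= x, y < 1."""
    product, exponent = x * y, ex + ey
    if product < HALF:                  # at most one superfluous zero
        product, exponent = L(product), exponent - 1
    return product, exponent


def scaled_divide(x, ex, y, ey):
    """Quotient of x*2**ex by y*2**ey, with 1/2 <= x, y < 1."""
    # One R brings x into [1/4, 1/2), so the quotient stays below 1.
    quotient, exponent = R(x) / y, ex - ey + 1
    if quotient < HALF:
        quotient, exponent = L(quotient), exponent - 1
    return quotient, exponent
```

In both routines the returned fraction again lies in the range 1/2 ≤ x < 1, so results can be fed back in without further adjustment.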

6.7.  Table  1  shows  that  many  of  the  operations  which  the 
control is to execute have common elements. Thus addition, subtraction,
multiplication and division all involve transferring a
number  from  the  Selectrons  to  SR.  Hence  the  control  may  be 
simplified  by  breaking  some  of  the  operations  up  into  more  basic 
ones.  A  timing  circuit  will  be  provided  for  each  basic  operation, 
and  one  or  more  such  circuits  will  be  involved  in  the  execution 
of  an  order.  The  exact  choice  of  basic  operations  will  depend  upon 
how  the  arithmetic  unit  is  built. 

In addition to the timing circuits needed for executing the
orders of Table 1, two such circuits are needed for the automatic
operations of transferring orders from the Selectron register to CR
and FR, and for transferring an order from CR to FR. In normal
computer operation these two circuits are used alternately, so a
binary counter is needed to remember which is to be used next.
In the operations Cu' → S(x) and Cc' → S(x) the first order of a pair
is ignored, so the binary counter must be altered accordingly.

The execution of a sequence of orders involves using the various
timing circuits in sequence. When a given timing circuit has
completed its operation, it emits a pulse which should go to the
timing circuit to be used next. Since this depends upon the particular
operation being executed, these pulses are routed according
to the signals received from the decoding and recoding function
tables activated by the six binary digits specifying an order.

Table 1

No.  Symbolization                Operation

 1   S(x) → Ac+       x           Clear accumulator and add number located at position x in the
                                  Selectrons into it.
 2   S(x) → Ac−       x −         Clear accumulator and subtract number located at position x in
                                  the Selectrons into it.
 3   S(x) → AcM       xM          Clear accumulator and add absolute value of number located at
                                  position x in the Selectrons into it.
 4   S(x) → Ac−M      x −M        Clear accumulator and subtract absolute value of number located
                                  at position x in the Selectrons into it.
 5   S(x) → Ah+       xh          Add number located at position x in the Selectrons into the
                                  accumulator.
 6   S(x) → Ah−       xh −        Subtract number located at position x in the Selectrons into
                                  the accumulator.
 7   S(x) → AhM       xhM         Add absolute value of number located at position x in the
                                  Selectrons into the accumulator.
 8   S(x) → Ah−M      x −hM       Subtract absolute value of number located at position x in the
                                  Selectrons into the accumulator.
 9   S(x) → R         xR          Clear register† and add number located at position x in the
                                  Selectrons into it.
10   R → A            A           Clear accumulator and shift number held in register into it.
11   S(x) × R → A     x×          Clear accumulator and multiply the number located at position x
                                  in the Selectrons by the number in the register, placing the
                                  left-hand 39 digits of the answer in the accumulator and the
                                  right-hand 39 digits of the answer in the register.
12   A ÷ S(x) → R     x÷          Clear register and divide the number in the accumulator by the
                                  number located in position x of the Selectrons, leaving the
                                  remainder in the accumulator and placing the quotient in the
                                  register.
13   Cu → S(x)        xC          Shift the control to the left-hand order of the order pair
                                  located at position x in the Selectrons.
14   Cu' → S(x)       xC'         Shift the control to the right-hand order of the order pair
                                  located at position x in the Selectrons.
15   Cc → S(x)        xCc         If the number in the accumulator is ≥ 0, shift the control as
                                  in Cu → S(x).
16   Cc' → S(x)       xCc'        If the number in the accumulator is ≥ 0, shift the control as
                                  in Cu' → S(x).
17   At → S(x)        xS          Transfer the number in the accumulator to position x in the
                                  Selectrons.
18   Ap → S(x)        xSp         Replace the left-hand 12 digits of the left-hand order located
                                  at position x in the Selectrons by the left-hand 12 digits in
                                  the accumulator.
19   Ap' → S(x)       xSp'        Replace the left-hand 12 digits of the right-hand order located
                                  at position x in the Selectrons by the left-hand 12 digits in
                                  the accumulator.
20   L                L           Multiply the number in the accumulator by 2, leaving it there.
21   R                R           Divide the number in the accumulator by 2, leaving it there.

† Register means arithmetic register.

6.8.  In  this  section  we  will  consider  what  must  be  added  to 
the  control  so  that  it  can  direct  the  mechanisms  for  getting  data 
into  and  out  of  the  computer  and  also  describe  the  mechanisms 
themselves.  Three  different  kinds  of  input-output  mechanisms  are 
planned. 

First: Several magnetic wire storage units operated by servomechanisms
controlled by the computer.


Second:  Some  viewing  tubes  for  graphical  portrayal  of  results. 

Third: A typewriter for feeding data directly into the computer,
not to be confused with the equipment used for preparing
and printing from magnetic wires. As presently planned the latter
will consist of modified Teletypewriter equipment, cf. 6.8.2 and
6.8.4.

6.8.1. Since there already exists a way of transferring numbers
between the Selectrons and Ac, Ac may be used for
transferring numbers from and to a wire. The latter transfer will
be done serially and will make use of the shifting facilities of Ac.
Using Ac for this purpose eliminates the possibility of computing
and reading from or writing on the wires simultaneously. However,
simultaneous operation of the computer and the input-output
organ requires additional temporary storage and introduces a
synchronizing problem, and hence it is not being considered for the
first model.

Since,  at  the  beginning  of  the  problem,  the  computer  is  empty, 
facilities  must  be  built  into  the  control  for  reading  a  set  of  numbers 
from  a  wire  when  the  operator  presses  a  manual  switch.  As  each 
number  is  read  from  a  wire  into  Ac,  the  control  must  transfer  it 
to  its  proper  location  in  the  Selectrons.  The  CC  may  be  used  to 
count off these positions in sequence, since it is capable of transmitting
its contents to FR. A detection circuit on CC will stop
the  process  when  the  specified  number  of  numbers  has  been  placed 
in  the  memory,  and  the  control  will  then  be  shifted  to  the  orders 
located  in  the  first  position  of  the  Selectron  memory. 

It has already been stated that the entire memory facilities of
the wires should be available to the computer without human
intervention. This means that the control must be able to select
the proper set of numbers from those going by. Hence additional
orders are required for the code. Here, as before, we are faced
with two alternatives. We can make the control capable of executing
an order of the form: Take numbers from positions p to
p + s on wire No. k and place them in Selectron locations v to
v + s. Or we can make the control capable of executing some less
complicated operations which, together with the already given
control orders, are sufficient for programming the transfer operation
of the first alternative. Since the latter scheme is simpler we
adopt it tentatively.

The computer must have some way of finding a particular
number on a wire. One method of arranging for this is to have
each number carry with it its own location designation. A method
more economical of wire memory capacity is to use the Selectron
memory facilities to remember the position of each wire. For
example, the computer would hold the number t_1 specifying which
number on the wire is in position to be read. If the control is
instructed to read the number at position p_1 on this wire, it will
compare p_1 with t_1; and if they differ, cause the wire to move
in the proper direction. As each number on the wire passes by,
one unit is added to or subtracted from t_1 and the comparison repeated.
When p_1 = t_1 numbers will be transferred from the wire to the
accumulator and then to the proper location in the memory. Then
both t_1 and p_1 will be increased by 1, and the transfer from the
wire to accumulator to memory repeated. This will be iterated
until t_1 + s and p_1 + s are reached, at which time the control
will direct the wire to stop.
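
The positioning scheme just described can be sketched as a short loop. This is a hypothetical illustration: the wire is modeled as a Python list of numbers, `t` is the remembered wire position, and the function name `read_block` and its parameters are invented here. It copies the numbers at wire positions p to p + s into memory locations v to v + s.

```python
def read_block(wire, t, p, s, memory, v):
    """Copy wire[p .. p+s] into memory[v .. v+s].

    `t` is the wire position currently under the reading head.
    Returns the wire position after the transfer.
    """
    # Move the wire in the proper direction until t equals p; one unit
    # is added to or subtracted from t as each number passes by.
    while t != p:
        t += 1 if t < p else -1
    for k in range(s + 1):
        ac = wire[t]          # transfer from wire to Ac
        memory[v + k] = ac    # transfer from Ac to the Selectrons
        t += 1                # the next number is now in position
    return t
```

Keeping `t` in the computer's memory is what spares each number on the wire from carrying its own location designation.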

Under this system the control must be able to execute the
following orders with regard to each wire: Start the wire forward,
start the wire in reverse, stop the wire, transfer from wire to Ac,
and transfer from Ac to wire. In addition, the wire must signal
the control as each digit is read and when the end of a number
has been reached. Conversely, when recording is done the control
must have a means of timing the signals sent from Ac to the wire,
and of counting off the digits. The 2^6 counter used for multiplication
and division may be used for the latter purpose, but other
timing circuits will be required for the former.

If  the  method  of  checking  by  means  of  two  computers  operating 
simultaneously  is  adopted,  and  each  machine  is  built  so  that  it 
can  operate  independently  of  the  other,  then  each  will  have  a 
separate  input-output  mechanism.  The  process  of  making  wires 
for  the  computer  must  then  be  duplicated,  and  in  this  way  the 
work  of  the  person  making  a  wire  can  be  checked.  Since  the  wire 
servomechanisms  cannot  be  synchronized  by  the  central  clock,  a 
problem  of  synchronizing  the  two  computers  when  the  wires  are 
being  used  arises.  It  is  probably  not  practical  to  synchronize  the 
wire  feeds  to  within  a  given  digit,  but  this  is  unnecessary  since 
the  numbers  coming  into  the  two  organs  Ac  need  not  be  checked 
as the individual digits arrive, but only prior to being deposited
in  the  Selectron  memory. 

6.8.2.  Since  the  computer  operates  in  the  binary  system,  some 
means  of  decimal-binary  and  binary-decimal  conversions  is  highly 
desirable.  Various  alternative  ways  of  handling  this  problem  have 
been  considered.  In  general  we  recognize  two  broad  classes  of 
solutions  to  this  problem. 

First: The conversion problems can be regarded as simple arithmetic
processes and programmed as sub-routines out of the orders
already incorporated in the machine. The details of these programs
together with a more complete discussion are given fully in Chapter
9, Part II, where it is shown, among other things, that the
conversion of a word takes about 5 msec. Thus the conversion time
is comparable to the reading or withdrawing time for a word
(about 2 msec) and is trivial as compared to the solution time for
problems to be handled by the computer. It should be noted that
the treatment proposed there presupposes only that the decimal
data presented to or received from the computer are in tetrads,
each tetrad being the binary coding of a decimal digit; the information
(precision) represented by a decimal digit is actually
equivalent to that represented by 3.3 binary digits. The coding
of decimal digits into tetrads of binary digits and the printing of
decimal digits from such tetrads can be accomplished quite simply
and automatically by slightly modified Teletype equipment, cf.
6.8.4 below.

Second: The conversion problems can be regarded as unique
problems and handled by separate conversion equipment incorporated
either in the computer proper or associated with the
mechanisms for preparing and printing from magnetic wires. Such
converters are really nothing other than special purpose digital
computers. They would seem to be justified only for those computers
which are primarily intended for solving problems in which
the computation time is small compared to the input-output time,
to which class our computer does not belong.
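
The first class of solutions, conversion programmed as a sub-routine, can be illustrated with a sketch that works on tetrads. It is a simplification made here for illustration: it converts whole numbers rather than the fractions the machine actually holds, each tetrad is represented as an integer from 0 to 9, and the function names are invented. The actual routines are those of Chapter 9, Part II, which this sketch does not reproduce.

```python
def tetrads_to_binary(tetrads):
    """Convert decimal digits (most significant first) to a binary integer
    by repeated multiply-by-ten and add."""
    value = 0
    for digit in tetrads:
        value = value * 10 + digit
    return value


def binary_to_tetrads(value, ndigits):
    """Convert a non-negative binary integer to `ndigits` decimal digits
    (most significant first) by repeated division by ten."""
    digits = []
    for _ in range(ndigits):
        value, digit = divmod(value, 10)
        digits.append(digit)
    return digits[::-1]
```

Each step needs only a multiplication or division by ten and an addition, which is why the conversion can be built from orders the machine already has.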

6.8.3. It is possible to use various types of cathode ray tubes,
and in particular Selectrons for the viewing tubes, in which case
programming the viewing operation is quite simple. The viewing
Selectrons can be switched by the same function tables that switch
the memory Selectrons. By means of the substitution operations
Ap → S(x) and Ap' → S(x), six-digit numbers specifying the abscissa
and ordinate of the point (six binary digits represent a precision
of one part in 2^6 = 64, i.e. of about 1.5 per cent, which seems
reasonable in such a component) can be substituted in this order,
which will specify that a particular one of the viewing Selectrons
is to be activated.

6.8.4. As was mentioned above, the mechanisms used for
preparing and printing from wire for the first model, at least, will
be modified Teletype equipment. We are quite fortunate in having
secured the full cooperation of the Ordnance Development Division
of the National Bureau of Standards in making these modifications
and in designing and building some associated equipment.

By means of this modified Teletype equipment an operator first
prepares a checked paper tape and then directs the equipment
to transfer the information from the paper tape to the magnetic
wire. Similarly a magnetic wire can transfer its contents to a paper
tape which can be used to operate a teletypewriter. (Studies are
being undertaken to design equipment that will eliminate the
necessity for using paper tapes.)

As was shown in 6.6.5, the statement of a new problem on a
wire involves data unique to that problem interspersed with data
found on previously prepared paper tapes or magnetic wires. The
equipment discussed in the previous paragraph makes it possible
for the operator to combine conveniently these data on to a single
magnetic wire ready for insertion into the computer.

It is frequently very convenient to introduce data into a computation
without producing a new wire. Hence it is planned to
build  one  simple  typewriter  as  an  integral  part  of  the  computer. 
By  means  of  this  typewriter  the  operator  can  stop  the  computation, 
type  in  a  memory  location  (which  will  go  to  the  FR),  type  in  a 
number  (which  will  go  to  Ac  and  then  be  placed  in  the  first 
mentioned  location),  and  start  the  computation  again. 

6.8.5.  There  is  one  further  order  that  the  control  needs  to 
execute.  There  should  be  some  means  by  which  the  computer  can 
signal  to  the  operator  when  a  computation  has  been  concluded, 
or  when  the  computation  has  reached  a  previously  determined 
point.  Hence  an  order  is  needed  which  will  tell  the  computer  to 
stop  and  to  flash  a  light  or  ring  a  bell. 




Chapter  5 

The  DEC  PDP-8 

Introduction¹

The PDP-8 is a single-address, 12-bit-word computer of the second
generation. It is designed for task environments with minimum
arithmetic computing and small Mp requirements. For example,
it can be used to control laboratory devices, such as gas chromatographs
or sampling oscilloscopes. Together with special T's, it is
programmed to be a laboratory instrument, such as a pulse height
analyzer or a spectrum analyzer. These applications are typical
of the laboratory and process control requirements for which the
machine was designed. As another example, it can serve as a
message concentrator by controlling telephone lines to which
typewriters and Teletypes are attached. The computer occasionally
stands alone as a small-scale general-purpose computer. Most
recently it was introduced as a small-scale general-purpose
time-sharing system, based on work at Carnegie-Mellon University and
DEC. It is used as a KT(display) when it has a P(display; '338);
this C is discussed in Chap. 25. The PDP-8 has achieved a production
status formerly reserved for IBM computers; about 5,000 have
been constructed.

The PDP-8 differs from the character-oriented 8-bit computer in
Chap. 10; it is not unlike the 16-bit computers, such as the IBM
1800 in Chap. 33. The PDP-8 is typical of several 12-bit computers:
the early CDC-160 series (1960), CDC-6600 Peripheral and Control
Processor (Chap. 39), the SDS-92, M.I.T. Lincoln Laboratory's
Laboratory Instrument Computer LINC (1963), Washington University's
Programmed Console (1967), and the SCC 650 (1966).

The PDP-5 (transistor, 1963), PDP-8 (1965), PDP-8/S (serial,
1966), PDP-8/I (integrated circuit, 1968), and PDP-8/L (integrated
circuit, 1968) constitute a series of computers based on evolving
technology. All of these have identical ISP's. Their PMS structures
are nearly identical, and all components other than Pc and Mp
are compatible throughout the series. The LINC-8-338 PMS structure
is presented in Fig. 1. A cost-performance tradeoff took place
in the PDP-8 (parallel-by-word arithmetic) and PDP-8/S (serial-by-bit
arithmetic) implementations. A PDP-8/S has about one-fifteenth
the performance of a PDP-8 at one-half the cost. The performance
difference can be attributed to a factor of 8/1.5, or 5.3, for Mp speed
and a factor of about 3 for logical organization, even though the
same 2-megahertz logic clock is used in both cases. The PDP-8 has
about 6.7 times the performance of a PDP-5.
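
The quoted factors can be checked with a few lines of arithmetic. The sketch below only multiplies the ratios given in the text; reading "8/1.5" as a ratio of Mp cycle times is an interpretation made here, and the variable names are illustrative.

```python
# Ratios quoted in the text (PDP-8 relative to PDP-8/S).
mp_speed_factor = 8 / 1.5            # about 5.3
logic_factor = 3                     # "a factor of about 3"
combined_factor = mp_speed_factor * logic_factor   # about 16

# One-fifteenth the performance at one-half the cost.
relative_performance = 1 / 15
relative_cost = 1 / 2
performance_per_cost = relative_performance / relative_cost  # about 0.13
```

The product of the two factors, about 16, is consistent with the stated one-fifteenth, and on these figures the PDP-8/S delivers roughly one-seventh to one-eighth of the PDP-8's performance per unit cost.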

¹ The initials in the title stand for Digital Equipment Corporation
Programmed Data Processor.


The ISP of the PDP-8 Pc is about the most trivial in the book.
It has only a few data operators, namely, ←, +, − (negate), ¬,
∧, /2, ×2, (optional) ×, /, and normalize. It operates on words,
integers, and boolean vectors. However, there are microcoded
instructions, which allow compound instructions to be formed in
a single instruction.

The  computer  is  straightforward  and  illustrates  the  levels  dis- 
cussed in  Chap.  1.  We  can  easily  look  at  it  from  the  "top  down." 
The  C  in  PMS  notation  is 

C('PDP-8; technology:transistors; 12 b/w;
    descendants:'PDP-8/S, 'PDP-8/I, 'PDP-8/L;
    antecedents:'PDP-5;
    Mp(core; #0:7; 4096 w; tc:1.5 μs/w);
    Pc(Mps(2 ~ 4 w);
        instruction length:1|2 w;
        address/instruction:1;
        operations on data/od:(←, +, ¬, ∧, −(negate), ×2, /2, +1);
        optional operations:(×, /, normalize);
        data-types:word, integer, boolean vector;
        operations for data access:4);
    P(display; '338);
    P(c; 'LINC);
    S('I/O Bus; 1 Pc; 64 K);
    Ms(disk, 'DECtape, magnetic tape);
    T(paper tape, card, analog, cathode-ray tube))

ISP 

The ISP is presented in Appendix 1 of this chapter (including the
optional Extended Arithmetic Element/EAE). The 2¹²-word Mp is divided
into 32 fixed-length pages of 128 words each. Address calculation is
based on references to the first page, Page_0, or to the current page
of the Program Counter/PC. The effective-address calculation procedure
provides for both direct and indirect reference to either the current
page or the first page. This scheme allows a 7-bit address to specify
local page addresses.
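The effective-address calculation can be sketched as follows. This is a minimal illustration and not the ISP of Appendix 1: the bit positions (i<3> as the indirect bit, i<4> as the page bit, i<5:11> as the page address) follow standard PDP-8 documentation rather than this section, the function name and the memory list are hypothetical, and the Auto_index side effect is left out.

```python
PAGE_BITS = 7                      # 7-bit address within a 128-word page
PAGE_MASK = (1 << PAGE_BITS) - 1   # 177 octal
WORD_MASK = 0o7777                 # 12-bit words and addresses

def effective_address(instruction, pc, memory):
    """Return the effective address for a memory-reference instruction.

    Bits are numbered 0 (most significant) to 11, as in the text.
    Assumption: pc is the address of the instruction being interpreted.
    """
    indirect = (instruction >> 8) & 1        # i<3>
    current_page = (instruction >> 7) & 1    # i<4>
    page_address = instruction & PAGE_MASK   # i<5:11>
    if current_page:
        address = (pc & ~PAGE_MASK & WORD_MASK) | page_address
    else:
        address = page_address               # reference to Page_0
    if indirect:
        address = memory[address] & WORD_MASK
    return address
```

The four combinations of the two bits give the four address operand determination modes: direct or indirect, on Page_0 or on the current page.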

A 2¹⁵-word Mp is available on the PDP-8, but addressing greater than
2¹² words is comparatively inefficient. In the extended range, two
3-bit registers, the Program Field and Data Field Registers, select
which of the eight 2¹²-word blocks are being actively addressed as
program and data.
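As a small illustration of the extended range, a 3-bit field register and a 12-bit address can be combined into a 15-bit Mp address. The function name is hypothetical, and the rule for choosing between the Program Field and the Data Field on a given reference is not modeled here.

```python
def extended_address(field, address):
    """Combine a 3-bit field register with a 12-bit address into 15 bits."""
    if not 0 <= field < 8:
        raise ValueError("a field register holds 3 bits")
    if not 0 <= address < 4096:
        raise ValueError("an address holds 12 bits")
    return (field << 12) | address
```

With field 7 and address 7777 (octal) the result is 32767, the last word of the 2¹⁵-word Mp.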

Fig. 1. DEC LINC-8-338 PMS diagram.

There is an array of eight registers, called the Auto_index registers,
which resides in Page_0. This array (Auto_index[0:7]<0:11> :=
M[10₈:17₈]<0:11>) possesses the useful property that whenever an
indirect reference is made to it, a 1 is first added to its contents.
(That is, there is a side effect to referencing.) Thus, address
integers in the register can select the next member of a vector or
string for accessing.
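The side effect can be sketched as below. The helper name and the example data are hypothetical; only the location range 10₈ to 17₈ and the increment before use come from the text.

```python
AUTO_INDEX_FIRST = 0o10
AUTO_INDEX_LAST = 0o17
WORD_MASK = 0o7777

def indirect_reference(memory, pointer_address):
    """Return the address found through an indirect reference.

    If the pointer lies in Auto_index (10 to 17 octal), it is incremented
    (modulo 2**12) before it is used.
    """
    if AUTO_INDEX_FIRST <= pointer_address <= AUTO_INDEX_LAST:
        memory[pointer_address] = (memory[pointer_address] + 1) & WORD_MASK
    return memory[pointer_address]

# Stepping through a three-word vector stored at 200 to 202 octal.
memory = [0] * 4096
memory[0o200:0o203] = [11, 22, 33]
memory[0o10] = 0o177            # one below the first element
vector = [memory[indirect_reference(memory, 0o10)] for _ in range(3)]
```

Because the increment comes first, the register is initialized to one less than the address of the first element.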

The instruction-set-execution definition can also be presented as a
decoding diagram or tree (Fig. 2). Here, each block represents an
encoding of bits in the instruction word. A decoding diagram allows
one more descriptive dimension than the conventional, linear ISP
description, revealing the assignment of bits to the instruction.
Figure 2 still requires ISP descriptions for Mp, Mps, the instruction
execution, the effective-address calculation, and the interpreter.
Diagrams such as Fig. 2 are useful in the ISP design to determine
which instruction numbers have been assigned to names and operations
and which are still free to be assigned (or encoded).

Fig. 2. DEC PDP-8 instruction-decoding diagram.

There are eight basic instructions encoded by 3 bits, that is,
op<0:2> := i<0:2>, where instruction/i<0:11>. Each of the first six
instructions (where 0 ≤ op < 6) has the four address operand
determination modes (thus yielding essentially 24 instructions). The
first six instructions are:

data transmission:   deposit and clear-accumulator/dca
                     two's complement add to the accumulator/tad
binary arithmetic:   two's complement add to the accumulator/tad
binary boolean:      and to the accumulator/and
program control:     jump/set program counter/jmp
                     jump to subroutine/jms
                     index memory and skip if results are zero/isz

Note that the add instruction, tad, is used for both data transmission
and arithmetic.
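The 3-bit operation code can be decoded as in the sketch below. The numeric assignment of mnemonics to op values 0 through 5 is taken from standard PDP-8 documentation and is not stated in this section; the function and tuple names are hypothetical.

```python
# Assumed op-code order (standard PDP-8 assignment, not given in this section).
MNEMONICS = ("and", "tad", "isz", "dca", "jms", "jmp", "iot", "opr")

def decode(instruction):
    """Split a 12-bit instruction into (mnemonic, indirect, current_page, address)."""
    op = (instruction >> 9) & 0o7          # op<0:2> := i<0:2>
    mnemonic = MNEMONICS[op]
    if op < 6:
        indirect = bool((instruction >> 8) & 1)       # i<3>
        current_page = bool((instruction >> 7) & 1)   # i<4>
        return mnemonic, indirect, current_page, instruction & 0o177
    # iot and operate instructions use the remaining 9 bits differently.
    return mnemonic, None, None, instruction & 0o777
```

The two mode bits of the six memory-reference instructions account for the "essentially 24 instructions" mentioned above.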

The subroutine-calling instruction, jms, provides a method for
transferring a link to the beginning (or head) of the subroutine. In
this way arguments can be accessed indirectly, and a return is
executed by a jump indirect instruction to the location storing the
return address. This straightforward subroutine-call mechanism,
although inexpensive to implement, requires reentrant and recursive
subroutine calls to be interpreted by software, rather than by
hardware. A stack, as in the DEC 338 (Chap. 25), would be nicer.
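A short sketch shows why the link stored at the head of the subroutine rules out recursion without software help. The function names and addresses are hypothetical, and pc is assumed to point at the instruction following the jms.

```python
WORD_MASK = 0o7777

def jms(memory, pc, target):
    """Jump to subroutine: store the link at the head, enter at head + 1."""
    memory[target] = pc & WORD_MASK
    return (target + 1) & WORD_MASK

def return_from(memory, target):
    """Return by a jump indirect through the subroutine head."""
    return memory[target]

memory = [0] * 4096
pc = jms(memory, pc=0o201, target=0o300)          # first call
nested_pc = jms(memory, pc=0o305, target=0o300)   # recursive call overwrites the link
```

After the second call the first return address (201 octal) is lost, so a recursive or reentrant subroutine must save the link itself.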

The input_output instruction, iot (:= op = 6), uses the remaining 9
bits of the instruction to specify instructions to input/output
devices. The 6 io_select bits select 1 of 64 devices. The 3 bits,
io_p1_bit, io_p2_bit, io_p4_bit, command the selected device by
conditionally providing three pulses in sequence. The instructions to
a typical io device are:

io_p1_bit → (IO_skip_flag[io_select] → (PC ← PC + 1))   testing a condition of an IO device
io_p4_bit → (Output_data[io_select] ← AC)               output to a device
io_p2_bit → (AC ← Input_data[io_select])                input from a device
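The iot fields and the three conditional pulses can be sketched as follows. The placement of io_select in i<3:8> and of the pulse bits in i<9:11> (p4, p2, p1) follows standard PDP-8 documentation; the dictionaries used for the processor state and the devices are hypothetical.

```python
def decode_iot(instruction):
    """Return (io_select, p1, p2, p4) for an iot instruction (op = 6)."""
    if (instruction >> 9) & 0o7 != 6:
        raise ValueError("not an iot instruction")
    io_select = (instruction >> 3) & 0o77   # i<3:8>: 1 of 64 devices
    p4 = bool(instruction & 0o4)            # i<9>
    p2 = bool(instruction & 0o2)            # i<10>
    p1 = bool(instruction & 0o1)            # i<11>
    return io_select, p1, p2, p4

def execute_iot(instruction, state, devices):
    """Apply the pulses in sequence p1, p2, p4 to the selected device."""
    io_select, p1, p2, p4 = decode_iot(instruction)
    device = devices[io_select]
    if p1 and device["skip_flag"]:
        state["PC"] = (state["PC"] + 1) & 0o7777
    if p2:
        state["AC"] = device["input_data"]
    if p4:
        device["output_data"] = state["AC"]
```

Because each pulse is conditional on its own bit, one iot instruction can combine a test, an input, and an output.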

There are three microcoded instruction groups selected by op = 7. The
instruction-decoding diagram (Fig. 2) and the ISP description
(Appendix 1 of this chapter) show the microinstructions which can be
combined in a single instruction. These instructions are: operate
group 1 (:= (op = 7) ∧ ¬i<3>) for operating on the processor state;
operate group 2 (:= (op = 7) ∧ (i<3,11> = 10₂)) for testing the
processor state; and the extended arithmetic element group
(:= (op = 7) ∧ (i<3,11> = 11₂)) for multiply, divide, etc. Within each
instruction the remaining bits, <4:10> or <4:11>, are extended
instruction (or opcode) bits; that is, the bits are microcoded to
select instructions. In this way an instruction is actually programmed
(or microcoded). For example, the instruction set_link → L ← 1 is
formed by coding the two microinstructions, clear link, next,
complement link.

opr_1 → (i<5> → L ← 0; next
         i<7> → L ← ¬L)

Thus, in operate group 1, the instructions clear link, complement
link, and set link are formed by coding instruction<5,7> = 10, 01, and
11, respectively. The operate group 2 instruction is used for testing
the condition of the Pc state. This instruction uses bits 5, 6, and 8
to code tests for the accumulator. The AC skip conditions are coded
(0 ~ 7) as never, always, =0, ≠0, <0, ≥0, ≤0, and >0. If all the
nonredundant and useful variations in the two operate groups were
available as separate instructions in the manner of the first seven
(dca, tad, etc.), there would be approximately 7 + 12(opr_1) +
10(opr_2) + 6(EAE) = 35 instructions in the PDP-8.
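The link microinstructions of operate group 1 can be sketched as below. Only the link is modeled, with bits i<5> and i<7> as given above; the function name and the example octal codes are assumptions for illustration.

```python
def operate_link(instruction, link):
    """Apply the link microinstructions of operate group 1 to the 1-bit L."""
    clear_link = (instruction >> 6) & 1        # i<5>
    complement_link = (instruction >> 4) & 1   # i<7>
    if clear_link:
        link = 0
    # "next": the complement is applied after the clear.
    if complement_link:
        link ^= 1
    return link

# Approximate instruction count quoted in the text.
instruction_count = 7 + 12 + 10 + 6
```

Coding both bits gives set link: the clear forces L to 0 and the following complement leaves L = 1 whatever its earlier value.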

The optional Extended Arithmetic Element/EAE includes additional
Multiplier Quotient/MQ and Shift Counter/SC registers and provides the
hardwired operations multiply, divide, logical shift left, arithmetic
shift, and normalize. The EAE is defined on the last page of
Appendix 1.

The  interrupt  scheme 

External conditions in the input/output devices can request that Pc be
interrupted. Interrupts are allowed if (Interrupt_state = 1). A
request to interrupt clears Interrupt_state (Interrupt_state ← 0), and
Pc behaves as though a jump to subroutine 0 instruction, jms 0, had
been given. A special iot instruction (instruction = 6001₈) followed
by a jump indirect to 0 instruction (instruction = 5400₈) returns Pc
to the interruptable state with Interrupt_state = 1. The program time
to save M(processor state/ps) is 6 Mp accesses (9 microseconds), and
the time to restore Mps is 9 Mp accesses (13.5 microseconds).
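The interrupt sequence and the quoted times can be sketched as follows. The dictionary holding the processor state and the function names are hypothetical; the 1.5-microsecond Mp cycle is the figure given for the PDP-8 core memory.

```python
MP_CYCLE_US = 1.5

def interrupt_request(state, memory):
    """Hardware response to a request: behave as jms 0 and disable interrupts."""
    if state["interrupt_state"] != 1:
        return False
    state["interrupt_state"] = 0
    memory[0] = state["PC"]      # link stored at location 0
    state["PC"] = 1              # service routine starts at location 1
    return True

def return_from_interrupt(state, memory):
    """Re-enable interrupts, then jump indirect through location 0."""
    state["interrupt_state"] = 1
    state["PC"] = memory[0]

save_time_us = 6 * MP_CYCLE_US       # 6 Mp accesses
restore_time_us = 9 * MP_CYCLE_US    # 9 Mp accesses
```

A second request arriving while Interrupt_state = 0 is simply not honored until the service routine returns.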

Only one interrupt level is provided in the hardware. If multiple
priority levels are desired, programmed polling is required. Most io
devices have to interrupt because they do not have a
program-controlled enable switch for the interrupt. For multiple
devices approximately 3 cycles (4.5 μs) are required to poll each
interrupter.

PMS  structure 

The PMS structure of the LINC-8-338, consisting of a Pc('LINC),
Pc('PDP-8), and P.display('338), is shown in Fig. 1. The PDP-8 is just
a single Pc. The Pc('LINC) is a very capable Pc with more instructions
than the main Pc. It is available in the structure to interpret
programs written for the C('LINC), a computer developed by M.I.T.'s
Lincoln Laboratory as a laboratory instrument computer for biomedical
and laboratory applications. Because of the rather limited ISP in Pc,
one would hardly expect to find all the components present in Fig. 1
in an actual configuration.

The S between the Mp and the Pc allows eight Mp's. This S is actually
S('Memory Bus; 8 Mp; 1 Pc; (P requests); time-multiplexed; 1.5 μs/w).
Thus the switch makes Mp logically equivalent to a single
Mp(32768 w). There are two other L's which are connected to the Pc,
excluding the T.console. They are L('I/O Bus) and L('Data Break;
Direct Memory Access). These links become switches when we consider
the physical structure. Associated with each device is a switch, and
the bus links all the devices; the L('I/O Bus) is really an
S('I/O Bus). Each time a K connects to it, the S is included in the K.
A simplified PMS diagram (Fig. 3) shows the structure and the
logical-physical transformation. Thus, the I/O Bus is

S('I/O Bus; duplex; bus; time-multiplexed; 1 Pc; 64 K; Pc controlled,
K requests; t:4.5 μs/w)

The S('I/O Bus) is the same for the PDP-5, 8, 8/S, 8/I, and 8/L.
Hence, any K can be used on any of the above C's. The I/O Bus is the
link to the K's for Pc-controlled data transfers. Each word
transferred is designated by a Pc instruction. However, the I/O Bus
allows a K to request Pc's attention via the interrupt request signal.
The Pc polls the K's to find the requesting K if multiple interrupt
requests occur. A detailed structure of the Pc-Mp (Fig. 4) shows these
L('I/O Bus, 'Data Break) connections to the registers and control in
the notation used by DEC. This diagram is essentially a functional
block diagram.

Fig. 3. DEC PDP-8 PMS diagram (simplified).

The S('I/O Bus) in Fig. 1 is only an abstract representation of the
structure. Since it is a bus structure, the S can be expanded into L's
and simple S's as shown in Fig. 3. The termination of the L in Pc is
given in Fig. 3. The corresponding logic at a K is given in Fig. 5 in
terms of logic design elements (AND's and OR's). (Fig. 5 also shows
the S('I/O Bus) structure of Figs. 1 and 3.) The operation of
S('I/O Bus) shown in Fig. 5 starts when Pc sends a signal to select
(or address) a particular K, using the IO_select<0:5> signals to form
a 6-bit code to which K responds. Each K is hardwired to respond to a
unique code. The local control, K[j], select signal is then used to
form three local commands when ANDed with the three iot command lines
from Pc, io_p1_bit, io_p2_bit, and io_p4_bit. Twelve data bits are
transmitted either to or from Pc, indirectly under K's control. This
is accomplished by using the AND-OR gates in K for data input to Pc,
and the AND gate for data input to K. The data lines are connected to
AC as shown in Fig. 4. A single skip input is used so that Pc can test
a status bit in K. A K communicates to Pc via the interrupt request
line. Any K wanting attention simply ORs its request signal into the
interrupt request signal. Program polling in Pc then selects the
specific interrupter. Normally, the K signal causing an interrupt is
also connected to the skip input.
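The gating at a single K can be sketched as below. This models only the logic described above (select-code comparison, three ANDed command pulses, and the ORed interrupt request); the class, its attribute names, and the choice of one flag serving as both status bit and request are assumptions.

```python
class Controller:
    """One K on the I/O Bus, hardwired to respond to a single select code."""

    def __init__(self, select_code):
        self.select_code = select_code
        self.flag = False          # status bit; also used as the interrupt request
        self.input_data = 0
        self.output_data = 0

    def respond(self, io_select, p1, p2, p4, ac):
        """Return (skip, ac) after applying the three gated command pulses."""
        selected = io_select == self.select_code
        skip = selected and p1 and self.flag
        if selected and p2:
            ac = self.input_data          # data input to Pc
        if selected and p4:
            self.output_data = ac         # data input to K
        return skip, ac

def interrupt_request_line(controllers):
    """OR of every controller's request signal."""
    return any(k.flag for k in controllers)
```

A K whose select code does not match ignores all three pulses, which is what lets up to 64 controllers share the same command and data lines.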

The L('Data Break; Direct Memory Access) provides a direct access path
for a P or K to Mp via Pc. The number of access ports to memory can be
expanded to eight by using the S('DM01 Data Multiplexer). The S is
requested from a P or K. The P or K supplies an Mp address, a read or
write access request, and then either accepts or supplies data for the
Mp accessed word. In the configuration (Fig. 1), P('LINC) and P('338)
are connected to S('DM01) and make requests to Mp for both their
instructions and data in the same way as the Pc. The global control of
these processor programs is via the S('I/O Bus). The Pc issues start
and stop commands, initializes their state, and examines their final
state when a program in the other P halts or requires assistance.

When a K is connected to L('Data Break) or to S('DM01 Data
Multiplexer), the K only accesses Mp for data. The most complex
function these K's carry out is the transfer of a complete block of
data between the Mp and an Ms or a T, for example, K('DECtape, disk).
A special mode, the three-cycle data break, is controlled by Pc so
that a K may request the next word from a queue in Mp. In this mode
the next word is taken from the queue (block) in Mp, and a counter is
reduced each time K makes a request. With this scheme, a word transfer
takes three Mp cycles: one to add one to the block count, one to add
one to the address pointer, and one to transmit the word.
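The three Mp cycles can be sketched as follows. The memory layout is an assumption made for illustration: the block count is held as a negative two's complement number at count_address, and the address pointer is held in the following word.

```python
WORD_MASK = 0o7777

def three_cycle_break(memory, count_address, data=None):
    """Perform one three-cycle data break and return (word, done).

    Assumed layout: memory[count_address] is the (negative) block count and
    memory[count_address + 1] is the address pointer.
    """
    # Cycle 1: add one to the block count; reaching zero ends the block.
    memory[count_address] = (memory[count_address] + 1) & WORD_MASK
    done = memory[count_address] == 0
    # Cycle 2: add one to the address pointer.
    pointer = count_address + 1
    memory[pointer] = (memory[pointer] + 1) & WORD_MASK
    # Cycle 3: transmit the word to or from Mp.
    address = memory[pointer]
    if data is not None:
        memory[address] = data & WORD_MASK
    return memory[address], done
```

At 1.5 μs per Mp cycle, each word moved this way costs 4.5 μs of memory time, three times the cost of a single-cycle break.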

The DECtape was derived from M.I.T.'s Lincoln Laboratory LINCtape
unit. Data are explicitly addressed by blocks (variable but by
convention 128 w). Thus information in a block can be replaced or
rewritten at random. This operation is unlike queue-accessed tape
(conventional IBM format magnetic tape) in which data can be appended
only to the end of a file.

Fig. 4. DEC PDP-8 timing and control-element block diagram. (Courtesy
of Digital Equipment Corporation.)
¹Transfer direction is into PDP-8 when −3 volts, out of PDP-8 when
ground.
²Data break request is for three-cycle break when ground or one-cycle
break when −3 volts.

Fig. 5. DEC PDP-8 S('I/O Bus) logic and PMS diagrams. (Panels: Actual
Bus Structure; Logical Structure.)
Select code = 100101 = k
k_select := (IO_select = k)
IO_pulse_p1 ∧ k_select (used for IO_skip[k] → PC ← PC + 1)
IO_pulse_p2 ∧ k_select (used for AC ← Input_data[k])
IO_pulse_p4 ∧ k_select (used for Output_data[k] ← AC)

The control for the T(telephone) links 64 Teletypes or typewriters to
the Pc. The final K, which connects to a line, operates on a
bit-serial basis. Since a telephone line sends and receives
information serially by bit, there are special input/output
instructions in the Pc to sample the line and to convert the sampled
bits to coded characters. There are 11 bits transmitted per character
(although other codings use 7, 7.42, 7.5, and 10 bits per character).
Of the 11 bits, there are 3 control, 1 parity, and 7 information bits.
The action of the Pc instruction, which is issued 5 × 11 (55) times
for every character, is to control the line by forming the 7-bit
characters. The instruction is a good example of tradeoff in the
hardware/software domain toward almost pure software; the only
hardware state associated with a telephone line is a 1-bit register to
hold the state of the outgoing line, and a single AND gate to sample
the incoming line state. This sampling process requires about 0.3 per
cent of Pc-Mp capacity per active line (each of 10 ~ 15 char/s). In
general, the PDP-8 hardware controls are minimal; in turn, fairly
elaborate control programs must be used as part of them.
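The software side of this tradeoff can be sketched as a routine that forms a character from the sampled line. The interpretation of "5 × 11" as five samples per bit, the use of the middle sample, and the frame order (start bit, 7 information bits least significant first, parity bit, 2 stop bits) are all assumptions; the text gives only the bit counts.

```python
SAMPLES_PER_BIT = 5
BITS_PER_CHARACTER = 11     # 3 control + 1 parity + 7 information

def assemble_character(samples):
    """Recover the 7 information bits from 55 line samples.

    Assumed frame order: start bit, 7 information bits (least significant
    first), parity bit, 2 stop bits.
    """
    if len(samples) != SAMPLES_PER_BIT * BITS_PER_CHARACTER:
        raise ValueError("expected 55 samples")
    middle = SAMPLES_PER_BIT // 2
    bits = [samples[i * SAMPLES_PER_BIT + middle]
            for i in range(BITS_PER_CHARACTER)]
    information = bits[1:8]
    return sum(bit << position for position, bit in enumerate(information))
```

Parity checking and detection of the start bit are omitted; in a real control program they would be further work moved from hardware into software.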

Computer  levels 

In this section we describe all the systems levels in the PDP-8
computer from the top down. The reader should already have a sketchy
knowledge of the PDP-8 because the registers and ISP have been
exposed. Here, we wish to clarify how it operates. A map of the
hierarchy is given in Fig. 6, starting from PMS to ISP and down
through logic design to circuit electronics. These description levels
are subdivided to provide more organizational detail. For example, the
register-transfer level has the more detailed registers, data
operators, functional units, and macro logic of the processor, whereas
the next logic level below has sequential and combinational networks,
and the sequential and combinatorial elements.

It should be apparent that the relationship of the various description
levels constitutes a tree structure where the organizationally complex
computer is the top node and each descending description level
represents increasing detail (or smaller component size), until the
final circuit element level is reached. For simplicity, only a few of
the many possible paths through the structural description tree are
illustrated. For example, the path showing mechanical parts is
missing. The path shown proceeds from the PDP-8 computer to the
processor and from there to the arithmetic unit or, more specifically,
to the AC register of the arithmetic unit. Next, the macro logic
implementing the register-transfer operations and functions for the
jth bit of the AC is given; the flip-flops and gates needed for this
particular implementation are shown. Finally, on the last segment of
the path, come the electronic circuits and components of which
flip-flops and NAND gates are constructed.


Fig. 6. DEC PDP-8 hierarchy of descriptions. ([x] indicates figure
number of instance.)

Abstract  representations 

Figure 6 also lists some of the methods used to represent the physical
computer abstractly at the different description levels. As mentioned
previously, only a small part of the PDP-8 description tree is
represented here. The many documents, schematics, diagrams, etc.,
which constitute the complete representation of even this small
computer include logic diagrams, wiring lists, circuit schematics and
printed-circuit board layout masks, production description diagrams,
production parts lists, testing specifications, programs for testing
and diagnosing faults, and manuals for modification, production,
maintenance, and use. As the discussion continues down the abstract
description tree, the reader will observe that the tree conveniently
represents the constituent objects of each level and their
interconnection at the next highest level. Each level in the
abstract-description tree will be described in order.

The PMS level

The simplified PMS structure in Fig. 3 has been reduced from Fig. 1.
The computer is small enough so that the physical delineation of the
PMS components, such as K's and S's, is less pronounced than in larger
systems. In fact, in the case of the S('Memory Bus, 'I/O Bus), the S's
are actually within the K and Mp; their implementation was shown in
Fig. 5. In Fig. 7 we present a more conventional functional diagram
and the equivalent PMS diagram of the computer, with Pc decomposed
into K, processor state (Mps), and D. The functional diagram has the
same components as the characteristic elementary computer model,
namely, K, D, M, and T(input, output). These figures give a somewhat
general idea of what processes can occur in the computer, and how
information flows, but it is apparent that at least another level is
needed to describe the internal structure and behavior of the Mp and
Pc. We should look at these primitives (although still together as a
C) at the register-transfer level.

Programming  level  (ISP) 

The ISP interpretation is given in Appendix 1 of this chapter and is
the specification of the programming machine. In addition, it
constrains the physical machine's behavior to have a particular ISP.
The ISP has been discussed earlier in the chapter.

Register-transfer  level 

The C can also be represented at the register-transfer level by using
PMS. Figure 4 (by DEC) shows the register-transfer level; only
registers, operations, and L's are important at this level. We still
lack information about the conditions under which operations are
evoked. Figure 8 is a PMS diagram of Pc-Mp registers. Here we show
considerably more detail (although we do not bother with electrical
pulse voltages and polarities) than in Fig. 4. We declare the Pc state
(including the temporary register) within Pc. The figure also gives
the data operations, D, which are permitted on the registers. It
should be clear from this that the logical design level for the
registers and the operators can easily be reached. The K logic design
cannot be reached until we use the programming level constraints
(ISP), thus defining the conditions for evoking the data operators.

Fig. 7. DEC PDP-8 function block and PMS diagrams. (a) Processor
functional block diagram. (b) Pc PMS diagram.

Fig. 8. DEC PDP-8 register-transfer-level PMS diagram.

The core memory. The Mp structure is given in Fig. 8. A more detailed
block diagram which shows the core stack with its twelve 64 × 64 1-bit
core planes is needed. Such a diagram, though still a functional block
diagram, takes on some of the aspects of a circuit diagram because a
core memory consists largely of circuit-level details. The Mp (Fig. 9)
consists of the component units: the two address decoders (which
select 1 each of 64 outputs in the X and Y axis directions of the
coincident current memory); selection switches (which transform a
coincident logic address into a high-current path to switch the
magnetic cores); the 12 inhibit drivers (which switch a high current
or no current into a plane when either a 0 or 1 is rewritten); 12
sense amplifiers (which take the induced low sense voltage from a
selected core from a plane being switched or not switched and
transform it into a 1 or 0); and the core stack, an array
M[0:7777₈]<0:11>. Since this is the only time the Mp is mentioned,
Fig. 9 also includes the associated circuit-level hardware needed in
the core-memory operation, such as power supplies, timing, and logic
signal level conversion amplifiers. The timing signals are generated
within Pc(K) and are shown together with Pc's clock in Fig. 10.

The  process  of  reading  a  word  from  memory  is: 

1  A  12-bit  selection  address  is  established  on  the  MA<0:11) 
address  lines,  which  is  1  of  lOOOOj,  (or  4()96jf,)  unique  num- 
bers. The  upper  6  bits,  <():5),  select  1  of  64  groups  of  Y 
addre.sses  and  the  lower  6  bits,  <6:11),  select  1  of  64  groups 
of  X  addresses. 

2  The  read  logic  signal  is  made  a  1. 

3  A  high-current  path  flows  via  the  X  and  Y  selection 
switches.  In  each  of  the  X  and  Y  directions  64  X  12  cores 


have  selection  current.  Only  one  core  in  each  plane  is 
selected  since  Ix  =  ly  =  Iswitching/2,  and  the  current  at 
the  selected  intersection  =  Ix  -I-  ly  =  Iswitching. 

4  If  a  core  is  switched  to  0  (bv  having  Iswitching  amperes 
through  it  ),  then  a  1  was  present  and  is  read  at  the  output 
of  the  plane  (bit)  sense  amplifiers.  A  sense  amplifier  receives 
an  input  from  a  winding  that  threads  every  core  of  every 
bit  within  a  core  plane  [0:7777^].  All  12  cores  of  the  selected 
word  are  reset  to  0.  The  sense  time  at  which  the  sense 
amplifier  is  observed  is  tms  (memory  strobe),  and  the  strobe 
in  effect  creates  MB  <-  M[MA]. 

5  The  read  current  is  turned  off. 


Fig.  9.  DEC  PDP-8  four-wire  coincident  current  (three  dimensions)  core-niemory4ogic  block  diagram. 


130  Part  2     The  instruction-set  processor:  main-line  computers 


Section  1     Processors  with  one  address  per  instruction 


Ciocli 

pulses       |(t2),     ,  |(tm5) 
0 

Reod  I 


|(tl) 


[(tmd)  |(t2) 


1.5  time 
(MS) 


Write 
Inhibit 
Memory 


I  [MB—  m[ma1 


-Memory  cycle - 


Fig.  10.  DEC  PDP-8  clock  and  memory  timing  diagram. 


6  The  write  and  inhibit  logic  signals  are  turned  on.  The  bit 
inhibit  signal  is  present  or  not,  depending  on  whether  a  0 
or  1,  respectively,  is  written  into  a  bit. 

7  A  high-current  path  flows  via  the  X  and  Y  selection 
switches,  but  in  an  opposite  direction  to  the  read  case  (2 
above).  If  a  1  is  written,  no  inhibit  current  is  present,  and 
the net current in the selected core is -Iswitching. If a
0 is written, the current is -Iswitching + (Iswitching/2)
and the core remains reset.

8 The inhibit and write logic signals are turned off, and the
memory cycle is completed.
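
The eight steps above can be sketched as a toy software model. This is only an illustration under stated assumptions: the class and function names (`CoreStack`, `split_address`) are hypothetical, currents are normalized so that Iswitching = 1, and timing (t1, t2, tms, tmd) is ignored entirely.

```python
WORDS = 0o10000      # 4096 words, addressed by MA<0:11>
BITS = 12            # one core plane per bit of the word
I_SWITCH = 1.0       # normalized switching current (assumed unit)


def split_address(ma):
    """Step 1: upper 6 bits of MA select Y, lower 6 bits select X."""
    return (ma >> 6) & 0o77, ma & 0o77


class CoreStack:
    """Hypothetical model: planes[bit][y][x] is one core holding 0 or 1."""

    def __init__(self):
        self.planes = [[[0] * 64 for _ in range(64)] for _ in range(BITS)]

    def read(self, ma):
        """Steps 2-5: destructive read; returns MB and leaves the word at 0."""
        y, x = split_address(ma)
        mb = 0
        for bit in range(BITS):
            # Only the selected core sees Ix + Iy = Iswitching.
            if self.planes[bit][y][x] == 1:
                # The core flips to 0, so the sense amplifier reports a 1.
                mb |= 1 << (BITS - 1 - bit)   # bit 0 is the leftmost bit
            self.planes[bit][y][x] = 0
        return mb

    def write(self, ma, mb):
        """Steps 6-8: reversed currents; inhibit cancels half of them for a 0.

        Assumes the selected cores were reset by a preceding read.
        """
        y, x = split_address(ma)
        for bit in range(BITS):
            want_one = (mb >> (BITS - 1 - bit)) & 1
            inhibit = 0.0 if want_one else I_SWITCH / 2
            net = -I_SWITCH + inhibit
            if net <= -I_SWITCH:              # full current: core set to 1
                self.planes[bit][y][x] = 1

    def cycle(self, ma):
        """One full memory cycle: read into MB, then rewrite from MB."""
        mb = self.read(ma)
        self.write(ma, mb)
        return mb
```

Calling `read` twice in a row on the same address returns the stored word and then 0, which is why every read in the real machine is followed by a rewrite from MB.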

Registers and operations. As Fig. 8 shows, the registers in the Pc
cannot be uniquely assigned to a single function. In a minimal
machine such as the PDP-8, functional separation is not economical.
Thus there are not completely distinct registers and transfer
paths for memory, arithmetic, and program and instruction flow.
(This sharing complicates understanding of the machine.) However,
Fig. 8 clarifies the structure considerably by defining all the
registers in Pc (including temporaries). For example, the Memory
Buffer/MB is used to hold the word being read from or written to
Mp. MB also holds one of the operands for binary operations (for
example, AC ← AC ∧ MB). MB is also used as an extension of
the Instruction Register/IR during the instruction interpretation.
The additional registers, not in the ISP, are:


Memory Buffer/MB<0:11>                      holds memory data, instruction, and operands
Memory Address/MA<0:11>                     holds address of word in Mp being accessed
Instruction Register/IR<0:2>                holds the value of current instruction being performed
State_register                              a ternary state register holding the major state of memory cycle being performed
Fetch/F := (State_register = 0)             memory cycle to fetch instruction
Defer/D/Indirect := (State_register = 1)    memory cycle to get address of operand
Execute/E := (State_register = 2)           memory cycle to fetch (store) operand and execute the instruction


Figure  8  has  been  concerned  with  the  static  definition  (or 
declaration)  of  the  information  paths,  the  operations,  and  state. 
The  ISP  interpretation  (Appendix  1)  is  the  specification  for  the 
physical  machine's  behavior.  As  the  temporary  hardware  registers 
are  added,  a  more  detailed  ISP  definition  could  be  given  in  terms 
of  time  and  temporary  registers.  Instead,  we  give  a  state  diagram 
(Fig.  11)  to  define  the  actual  Pc  which  is  constrained  by  both  the 
ISP  registers,  the  temporary  registers  implied  by  the  implementa- 
tion, and  time.  The  relationship  among  the  state  diagram,  the  ISP 
description,  and  the  logic  is  shown  in  the  hierarchy  of  Fig.  6.  In 
the  relationships  of  the  figures,  we  observe  that  the  ISP  definition 
does  not  have  all  the  necessary  detail  for  fully  defining  a  physical 
Pc.  The  physical  Pc  is  constrained  by  actual  hardware  logic  and 
lower-level  details  even  at  the  circuit  level.  For  example,  a  core 
memory  is  read  by  a  destructive  process  and  requires  a  temporary 
register (MB) to hold the value being rewritten. This is not
representable within a single ISP language statement since we define
only the nondestructive transfer ←, but it can be considered as
the two parallel operations MB ← M[MA]; M[MA] ← 0. The
problem  of  explaining  rewriting  of  core  using  ISP  is  also  difficult, 
because  explicit  time  is  not  in  the  ISP  language  (although  we  can 
define  clock  events,  or  at  least  relative  time). 

The  state  diagram  (Fig.  11)  describes  the  implementation  be- 
havior using  the  registers  and  register  operations  (Fig.  8)  and  the 
temporary  registers  declared  above. 

The implementation is fundamentally Mp-timing-based, as we
see from both the state diagram and the times when the four clock
signals are generated (Fig. 10). Thus there are three (State_register
= 0,1,2) × 4 (clock), that is, 12 major states, in the implementation.
We use the IR to obtain two more states, F2b and F3b,


Chapter  5  |  The  DEC  PDP-8  131 


[State diagram not reproduced.]

Note: State diagram does not include Data Break, Interrupt, and EAE.

Fig.  11.  DEC  PDP-8  Pc  state  diagram. 


for the description. The State_register values 0, 1, and 2 correspond
to fetching, deferring (indirect addressing, i.e., fetching an
operand address), and executing (fetching or storing data, then
executing) the instruction. The state diagram does not describe
the Extended Arithmetic Element/EAE operation, the interrupt
state, and the data break states (these add 12 more states). The
initialization procedure, including the T.console state diagram, is
also not given. One should observe that when t2 occurs at the
beginning of the memory cycle, a new State_register value is
selected. The State_register value is always held for the remainder
of the cycle; i.e., only the sequences (F0 → F1 → F2 → F3 or
D0 → D1 → D2 → D3 or E0 → E1 → E2 → E3) are permitted.
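
The counting argument above (three State_register values times four clock times) can be illustrated with a short sketch. The names `major_states` and `step` are hypothetical, and the extra states F2b and F3b, data break, interrupt, and EAE are deliberately left out.

```python
from itertools import product

STATE_NAMES = {0: "F", 1: "D", 2: "E"}   # Fetch, Defer, Execute
CLOCKS = 4                               # four clock times per memory cycle


def major_states():
    """Enumerate the 3 x 4 major states as labels F0..F3, D0..D3, E0..E3."""
    return [f"{STATE_NAMES[s]}{c}"
            for s, c in product(sorted(STATE_NAMES), range(CLOCKS))]


def step(state_register, clock, next_state_register):
    """Advance one clock time.

    State_register is held for the whole cycle; a new value is taken
    only when the cycle wraps around to its first clock time.
    """
    if clock < CLOCKS - 1:
        return state_register, clock + 1
    return next_state_register, 0
```

With this model `step(0, 1, 2)` stays in the fetch cycle, while `step(0, 3, 2)` moves from the last fetch state to the first execute state, so only the three permitted sequences can occur.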

Figure  8  alludes  to  Pc(K),  that  is,  the  sequential  network  used 
for  controlling  Pc.  The  inputs  and  the  present  state  (including 
clocks)  determine  the  operations  to  be  issued  on  the  registers. 


Logic  design  level  (registers  and  data  operations) 

Proceeding  from  the  register-transfer  and  ISP  descriptions,  the 
next  level  of  detail  is  the  logic  module.  Typical  of  the  level  is  the 
1-bit logic module for an accumulator bit, AC<j>, illustrated in
Fig. 12. The horizontal data inputs in the figure are to the logic
module from AC<j>, MB<j>, IO Bus<j>, and Data_switch<j>. The
vertical control signal inputs command the register operations (i.e.,
the transfers); they are labeled by their respective ISP operations
(for example, AC ← MB ∧ AC, AC ← AC × 2 {rotate}). The
sequential  network  Pc(K)  (Fig.  8)  generates  these  control  signal 
inputs. 

Logic design level (Pc control, Pc(K) sequential network)

The  output  signals  from  the  Pc(K)  (Fig.  8)  can  be  generated  in 
a straightforward fashion by formulating the boolean expressions




[Logic diagram: 1-bit AC<J> module. Data inputs: AC<J>, MB<J>, AC<J-1>, AC<J+1>, Data_switch<J>, IO Bus AC input<J>; carry input and carry output. Command inputs (bus to each bit of AC): AC ← AC ∨ Data_switch; AC ← AC ∧ MB; AC ← AC ⊕ MB; AC ← AC/2 {rotate}; AC ← AC × 2 {rotate}; AC ← Carry(AC,MB). AC ← AC + 1 is formed by AC<12> carry input.]


Fig.  12.  DEC  PDP-8  AC<J>  bit  register-transfer  logic  diagram. 


AC ← 0 := (
  (t1 ∧ (IR = 111) ∧ (¬MB<3> ∧ MB<4> ∧ ¬MB<6>) ∧ (State_register = 0)) ∨
  (t1 ∧ (IR = 111) ∧ (MB<3> ∧ ¬MB<11> ∧ MB<4>) ∧ (State_register = 0)) ∨
  (t1 ∧ (IR = 111) ∧ (MB<3> ∧ MB<11> ∧ MB<4>) ∧ (State_register = 0)) ∨ ¹
  (t1 ∧ (IR = 011) ∧ (State_register = 2)))

= (t1 ∧ (((State_register = 0) ∧ (IR = 111) ∧ MB<4> ∧ (MB<3> ∨ ¬MB<6>)) ∨
         ((State_register = 2) ∧ (IR = 011))))

Logic equation for AC ← 0

[Logic diagram for AC ← 0, with inputs t1, IR<0>, IR<2>, (State_register = 0), (State_register = 2), and MB<4>.]

¹This term is derived from EAE and is not on the state diagram.


Fig. 13. DEC PDP-8 Pc(K) AC ← 0 signal-logic equations and diagram.


Note: This is not an "ideal" sequential circuit element, because there is no delay in the output.


Fig.  14.  DEC  PDP-8  sequential-element  circuit  and  logic  diagrams. 




[Circuit diagram: multiple-input inverter circuit, with its NAND logic element and NOR logic element equivalents.]

Table of circuit behavior        Table of NAND behavior     Table of NOR behavior
(volts)

Input            Output          Input        Output        Input        Output
 1    2    3                     1  2  3                    1  2  3

 0    0    0      -3             0  0  0        1           1  1  1        0
 0    0   -3      -3             0  0  1        1           1  1  0        0
 0   -3    0      -3             0  1  0        1           1  0  1        0
 0   -3   -3      -3             0  1  1        1           1  0  0        0
-3    0    0      -3             1  0  0        1           0  1  1        0
-3    0   -3      -3             1  0  1        1           0  1  0        0
-3   -3    0      -3             1  1  0        1           0  0  1        0
-3   -3   -3       0             1  1  1        0           0  0  0        1

Fig.  15.  DEC  PDP-8  combinational  element  circuit  and  logic  diagrams. 

directly from the state diagram in Fig. 11. For example, the
AC ← 0 control signal is expressed algebraically and with a
combinatorial network in Fig. 13. Obviously these boolean output
control signals are functions which include the clock, the
State_register, and the states of the arithmetic registers (for
example, AC = 0, L = 0, etc.). The expressions should be factored
and minimized so as to reduce the hardware cost of the control
for the interpreter. Although we are rather cavalier about
Pc(K), it constitutes about one-half the logic within Pc.
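
The factoring step can be checked mechanically. The sketch below compares a sum-of-products form of the AC ← 0 signal with its factored form over every input combination. The bit indices used here (MB<3>, MB<4>, MB<6>, MB<11>) are an assumed reading of the equation in Fig. 13, and the function names are hypothetical.

```python
from itertools import product


def ac_clear_sum_of_products(t1, ir, state, mb3, mb4, mb6, mb11):
    """Four-term form: three operate (IR = 111) terms and one dca (IR = 011) term."""
    opr = ir == 0b111
    return bool(
        (t1 and opr and (not mb3 and mb4 and not mb6) and state == 0)
        or (t1 and opr and (mb3 and not mb11 and mb4) and state == 0)
        or (t1 and opr and (mb3 and mb11 and mb4) and state == 0)
        or (t1 and ir == 0b011 and state == 2)
    )


def ac_clear_factored(t1, ir, state, mb3, mb4, mb6, mb11):
    """Factored form: t1 is pulled out and MB<11> drops out entirely."""
    return bool(
        t1 and (
            (state == 0 and ir == 0b111 and mb4 and (mb3 or not mb6))
            or (state == 2 and ir == 0b011)
        )
    )


def equivalent():
    """Exhaustively compare both forms over all inputs."""
    for t1, mb3, mb4, mb6, mb11 in product((False, True), repeat=5):
        for ir in range(8):
            for state in range(3):
                args = (t1, ir, state, mb3, mb4, mb6, mb11)
                if ac_clear_sum_of_products(*args) != ac_clear_factored(*args):
                    return False
    return True
```

The factored form needs fewer gate inputs because the two MB<3> terms differ only in MB<11>, which therefore cancels; this is the kind of reduction the paragraph above has in mind.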


Circuit  level 

The  final  level  of  description  is  the  circuits  which  form  the  logic 
functions  of  storage  (flip-flops)  and  gating  (NAND  gates).  Figures 
14  and  15  illustrate  some  of  these  logic  devices  in  detail. 

In  Fig.  14  a  direct  set  and  direct  clear  flip-flop,  a  sequential- 
logic  element,  is  described  in  terms  of  circuit  implementation, 
combinational  logic  equivalent,  a  table  of  its  behavior,  and  its 
algebraic  behavior.  Note  that  this  is  not  an  ideal  element,  be- 
cause it  has  no  delay  and  responds  directly  and  immediately  to 
an  input.  Some  idealized  sequential  logic  elements  are  used  in 
the  PDP-8  (but  not  illustrated),  including  the  RS  (Reset-Set), 
T(Trigger),  JK,  and  D(Delay).  A  delay  in  the  flip-flops  makes  them 
behave  in  the  same  way  as  the  ideal  primitives  in  sequential- 
circuit theory. The outputs require a series delay, Δt, such that,
if the inputs change at time t, the outputs will not change until
t + Δt. In fact, the PDP-8 uses capacitor-diode gates at the
flip-flop inputs to delay the inputs.

Figure  15  illustrates  the  combinatorial  logic  elements  used  in 
the  PDP-8.  The  circuit  selection  is  limited  to  the  inverter  circuit 
with  single  or  multiple  inputs.  These  are  more  familiarly  called 
NAND gates or NOR gates, depending on whether one uses positive
and/or negative logic-level definitions.
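
The dependence on logic-level definitions can be made concrete with a small sketch. It assumes the circuit behavior tabulated in Fig. 15 (output at 0 V only when every input is at -3 V) and uses hypothetical helper names; `one_level` is the voltage chosen to represent logical 1.

```python
from itertools import product


def inverter_circuit(*inputs_volts):
    """Multiple-input inverter: 0 V out only if every input is at -3 V."""
    return 0 if all(v == -3 for v in inputs_volts) else -3


def as_volts(bit, one_level):
    """Map a logic value to a voltage under the chosen convention."""
    zero_level = 0 if one_level == -3 else -3
    return one_level if bit else zero_level


def as_logic(volts, one_level):
    """Map a voltage back to a logic value under the chosen convention."""
    return 1 if volts == one_level else 0


def gate(bits, one_level):
    """Logical behavior of the same circuit under a given convention."""
    out = inverter_circuit(*(as_volts(b, one_level) for b in bits))
    return as_logic(out, one_level)


def truth_table(one_level, n=3):
    return {bits: gate(bits, one_level) for bits in product((0, 1), repeat=n)}
```

Taking -3 V as logical 1 makes `truth_table(-3)` a NAND table, while taking 0 V as logical 1 makes `truth_table(0)` a NOR table; the circuit itself is unchanged.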

Conclusion 

We  could  continue  to  discuss  the  behavior  of  the  transistor  as  it 
is  used  in  these  switching-circuit  primitives  but  will  leave  that 
to  books  on  semiconductor  electronics  and  physics.  It  is  hoped 
that  the  student  has  gained  a  grasp  of  how  to  think  about  the 
hierarchical  decomposition  of  computers  into  particular  levels  of 
analysis  (and  synthesis). 




APPENDIX  1    DEC  PDP-8  ISP  DESCRIPTION 




Pc State

AC<0:11>                                          Accumulator
L                                                 Link bit/AC extension for overflow and carry
PC<0:11>                                          Program Counter
Run                                               1 when Pc is interpreting instructions or "running"
Interrupt_state                                   1 when Pc can be interrupted; under programmed control
IO_pulse_1; IO_pulse_2; IO_pulse_4                IO pulses to IO devices

Mp State

Extended memory is not included.

M[0:7777₈]<0:11>
Page_0[0:177₈]<0:11> := M[0:177₈]<0:11>           special array of directly addressed memory registers
Auto_index[0:7]<0:11> := Page_0[10₈:17₈]<0:11>    special array when addressed indirectly, is incremented by 1

Pc Console State

Keys for start, stop, continue, examine (load from memory), and deposit (store in memory) are not included.

Data switches<0:11>                               data entered via console

Instruction Format

instruction/i<0:11>
op<0:2>            := i<0:2>                      op code
indirect_bit/ib    := i<3>                        0, direct; 1 indirect memory reference
page_0_bit/p       := i<4>                        0 selects page 0; 1 selects this page
page_address<0:6>  := i<5:11>
this_page<0:4>     := PC'<0:4>
PC'<0:11>          := (PC<0:11> - 1)
IO_select<0:5>     := i<3:8>                      selects a T or Ms device
io_p1_bit          := i<11>                       these 3 bits control the selective generation
io_p2_bit          := i<10>                       of 0.4 μs pulses to I/O devices
io_p4_bit          := i<9>
sma                := i<5>                        μ bit for skip on minus AC, operate 2 group
sza                := i<6>                        μ bit for skip on zero AC
snl                := i<7>                        μ bit for skip on non zero Link

Effective Address Calculation Process

z<0:11> := (                                      effective
    ¬ib → z";
    ib ∧ (10₈ ≤ z" ≤ 17₈) → (M[z"] ← M[z"] + 1; next);     auto indexing
    ib → M[z"])
z'<0:11> := (¬ib → z";
    ib → M[z"])
z"<0:11> := (page_0_bit → this_page□page_address;          direct address
    ¬page_0_bit → 0□page_address)

μ microcoded instruction or instruction bit(s) within an instruction
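
The effective address calculation can be sketched in executable form. This is an illustration only: the function names are hypothetical, memory is modeled as a plain list of 4096 integers, and bit i<0> is taken to be the leftmost (most significant) bit of the 12-bit word, so i<3> is the bit worth 400₈ and i<4> the bit worth 200₈.

```python
MASK12 = 0o7777


def direct_address(i, pc):
    """z'': page bit plus 7-bit page address; pc is PC after the fetch increment."""
    page_address = i & 0o177               # i<5:11>
    page_0_bit = (i >> 7) & 1              # i<4>
    if page_0_bit:
        # this_page is the upper 5 bits of PC' = PC - 1 (the instruction's own page)
        this_page = ((pc - 1) & MASK12) >> 7
        return (this_page << 7) | page_address
    return page_address                    # page 0


def effective_address(i, pc, memory):
    """z: apply indirection; indirect use of 10..17 (octal) auto-increments first."""
    z2 = direct_address(i, pc)
    indirect_bit = (i >> 8) & 1            # i<3>
    if not indirect_bit:
        return z2
    if 0o10 <= z2 <= 0o17:
        memory[z2] = (memory[z2] + 1) & MASK12
    return memory[z2]
```

For example, an instruction word 1620₈ fetched from location 400₈ (so PC = 401₈) names location 420₈ on the current page and, being indirect, yields whatever address is stored there.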




Instruction Interpretation Process

Run ∧ ¬(Interrupt_request ∧ Interrupt_state) → (           no interrupt interpreter
    instruction ← M[PC]; PC ← PC + 1; next                  fetch
    instruction_execution);                                 execute
Run ∧ Interrupt_request ∧ Interrupt_state → (              interrupt interpreter
    M[0] ← PC; Interrupt_state ← 0; PC ← 1)

Instruction Set and Instruction Execution Process

Instruction_execution := (
    and (:= op = 0) → (AC ← AC ∧ M[z]);                     logical and
    tad (:= op = 1) → (L□AC ← L□AC + M[z]);                 two's complement add
    isz (:= op = 2) → (M[z'] ← M[z] + 1; next               index and skip if zero
        (M[z'] = 0) → (PC ← PC + 1));
    dca (:= op = 3) → (M[z] ← AC; AC ← 0);                  deposit and clear AC
    jms (:= op = 4) → (M[z] ← PC; next PC ← z + 1);         jump to subroutine
    jmp (:= op = 5) → (PC ← z);                             jump
    iot (:= op = 6) → (                                     μ in out transfer, microprogrammed to generate up to 3 pulses
        io_p1_bit → IO_pulse_1 ← 1; next                    to an IO device addressed by IO_select
        io_p2_bit → IO_pulse_2 ← 1; next
        io_p4_bit → IO_pulse_4 ← 1);
    opr (:= op = 7) → Operate_execution                     the operate instruction is defined below
    )                                                       end Instruction execution
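
The fetch and execute processes for the six memory-reference instructions can be sketched as a very small interpreter. This is a hedged illustration rather than a full PDP-8 model: the class name `Pdp8` is hypothetical, interrupts, iot, and opr are omitted, and i<0> is taken to be the most significant bit of the word.

```python
MASK12 = 0o7777


class Pdp8:
    """Minimal sketch of the ISP state: M, AC, L, PC."""

    def __init__(self):
        self.m = [0] * 0o10000
        self.ac = 0
        self.l = 0
        self.pc = 0

    def _z(self, i):
        """Effective address, including indirection and auto-indexing."""
        page_address = i & 0o177
        if (i >> 7) & 1:                        # page bit: current page
            z2 = ((self.pc - 1) & 0o7600) | page_address
        else:
            z2 = page_address                   # page 0
        if (i >> 8) & 1:                        # indirect bit
            if 0o10 <= z2 <= 0o17:
                self.m[z2] = (self.m[z2] + 1) & MASK12
            return self.m[z2]
        return z2

    def step(self):
        """One pass of: instruction <- M[PC]; PC <- PC + 1; next execute."""
        i = self.m[self.pc]
        self.pc = (self.pc + 1) & MASK12
        op = i >> 9
        if op >= 6:
            raise NotImplementedError("iot and opr are outside this sketch")
        z = self._z(i)
        if op == 0:                             # and
            self.ac &= self.m[z]
        elif op == 1:                           # tad: 13-bit add on L and AC
            total = ((self.l << 12) | self.ac) + self.m[z]
            self.l, self.ac = (total >> 12) & 1, total & MASK12
        elif op == 2:                           # isz
            self.m[z] = (self.m[z] + 1) & MASK12
            if self.m[z] == 0:
                self.pc = (self.pc + 1) & MASK12
        elif op == 3:                           # dca
            self.m[z] = self.ac
            self.ac = 0
        elif op == 4:                           # jms
            self.m[z] = self.pc
            self.pc = (z + 1) & MASK12
        elif op == 5:                           # jmp
            self.pc = z
```

Treating L and AC as one 13-bit quantity in `tad` mirrors the L□AC concatenation in the ISP text, so a carry out of AC lands in the link bit.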

Operate Instruction Set

The microprogrammed operate instructions: operate group 1, operate group 2, and extended arithmetic are defined as a separate
instruction set.

Operate_execution := (
    cla (:= i<4> = 1) → (AC ← 0);                                   μ clear AC. Common to all operate instructions.
    opr_1 (:= i<3> = 0) → (                                         operate group 1
        cll (:= i<5> = 1) → (L ← 0); next                           μ clear link
        cma (:= i<6> = 1) → (AC ← ¬AC);                             μ complement AC
        cml (:= i<7> = 1) → (L ← ¬L); next                          μ complement L
        iac (:= i<11> = 1) → (L□AC ← L□AC + 1); next                μ increment AC
        ral (:= i<8:10> = 2) → (L□AC ← L□AC × 2 {rotate});          μ rotate left
        rtl (:= i<8:10> = 3) → (L□AC ← L□AC × 2² {rotate});         μ rotate twice left
        rar (:= i<8:10> = 4) → (L□AC ← L□AC / 2 {rotate});          μ rotate right
        rtr (:= i<8:10> = 5) → (L□AC ← L□AC / 2² {rotate}));        μ rotate twice right
    opr_2 (:= i<3,11> = 10) → (                                     operate group 2
        skip_condition ⊕ (i<8> = 1) → (PC ← PC + 1); next           μ AC,L skip test
            skip_condition := ((sma ∧ (AC < 0)) ∨ (sza ∧ (AC = 0)) ∨ (snl ∧ L))
        osr (:= i<9> = 1) → (AC ← AC ∨ Data switches);              μ "or" switches
        hlt (:= i<10> = 1) → (Run ← 0));                            μ halt or stop
    EAE (:= i<3,11> = 11) → EAE_instruction_execution)              optional EAE description
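
Two pieces of the operate instruction set lend themselves to a short sketch: the rotates, which act on the 13-bit quantity L□AC, and the group 2 skip test, where bit i<8> reverses the sense of the condition. The function names are hypothetical, and i<0> is again taken as the most significant bit of the 12-bit word.

```python
def ral(l, ac):
    """Rotate L and AC left one place as a single 13-bit ring."""
    v = (l << 12) | ac
    v = ((v << 1) | (v >> 12)) & 0o17777
    return v >> 12, v & 0o7777


def rar(l, ac):
    """Rotate L and AC right one place as a single 13-bit ring."""
    v = (l << 12) | ac
    v = ((v >> 1) | ((v & 1) << 12)) & 0o17777
    return v >> 12, v & 0o7777


def group2_skip(i, l, ac):
    """True when PC should be incremented: skip_condition XOR i<8>."""
    sma = (i >> 6) & 1          # i<5>: skip on minus AC
    sza = (i >> 5) & 1          # i<6>: skip on zero AC
    snl = (i >> 4) & 1          # i<7>: skip on non-zero link
    reverse = (i >> 3) & 1      # i<8>: reverse the sense of the test
    condition = (sma and ac & 0o4000) or (sza and ac == 0) or (snl and l)
    return bool(condition) != bool(reverse)
```

A word with only the reverse bit set and no condition bits skips unconditionally under this reading, since a false condition exclusive-or a true i<8> is true.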



KT and KMs State

Each K may have any or all of the following registers. There can be up to 64 optional K's.

Input_data[0:77₈]<0:11>                           64 input buffers
Output_data[0:77₈]<0:11>                          64 output buffers
IO_skip_flag[0:77₈]                               64 test conditions
IO_interrupt_request[0:77₈]                       1 signifies a request. If Interrupt_state = 1, then an
                                                  interrupt occurs.
Interrupt_request := (                            "or" of all requests from each IO device
    max(IO_interrupt_request[0:77₈]))

Extended Arithmetic Element, EAE (optional)

Provides additional arithmetic instructions (or operators) including ×, /, normalize, logical shift and arithmetic shift.

EAE State

MQ<0:11>                                          Multiplier Quotient
SC<0:4>                                           Shift Counter

Instruction Format and Data

mds<0:11>                                         multiplier divisor shift data
S<0:4> := mds<7:11>                               shift count parameter

Instruction Set for EAE

EAE_instruction_execution := (next
    mqa (:= i<5>) → (AC ← AC ∨ MQ);                                 MQ into AC
    sca (:= i<6>) → (AC ← AC ∨ SC);                                 SC into AC
    mql (:= i<7>) → (MQ ← AC; AC ← 0); next                         AC into MQ, clear AC

    Note only one of nmi, shl, asr, lsr, muy, or dvi can be given at a time.

    i<8:10> = 0 → ;                                                 no operation
    ¬nmi → (mds ← M[PC]; PC ← PC + 1); next
    muy (:= i<8:10> = 2) → (L□AC□MQ ← MQ × mds; SC ← 0);            multiply
    dvi (:= i<8:10> = 3) → (MQ ← L□AC□MQ / mds;                     divide
        L□AC ← L□AC□MQ mod mds; SC ← 0);
    nmi (:= i<8:10> = 4) → (AC□MQ ← normalize(AC□MQ);               normalize (AC,MQ) into SC
        SC ← normalize_exponent(AC□MQ));
    shl (:= i<8:10> = 5) → (L□AC□MQ ← L□AC□MQ × 2^(S+1); SC ← 0);   shift left
    asr (:= i<8:10> = 6) → (L□AC□MQ ← L□AC□MQ / 2^(S+1); SC ← 0);   shift right
    lsr (:= i<8:10> = 7) → (L□AC□MQ ← L□AC□MQ / 2^(S+1) {logical};  logical shift
        SC ← 0)
    )                                                               end EAE instruction execution

Chapter  6 

The Whirlwind I computer¹


R.  R.  Everett 

Project Whirlwind is a high-speed computer activity sponsored
at the Digital Computer Laboratory, formerly a part of the
Servomechanisms Laboratory, of the Massachusetts Institute of
Technology (M.I.T.) by the Office of Naval Research (O.N.R.) and the
United States Air Force. The project began in 1945 with the
assignment of building a high-quality real-time aircraft simulator.
Historically, the project has always been primarily interested in
the fields of real-time simulation and control; but since about the
beginning of 1947 most of its efforts have been devoted to the
design and construction of the digital computer known as
Whirlwind I (WWI). This computer has been in operation for about
1 year and an increasing proportion of project effort now is going
into application studies.

Applications for digital computers are found in many branches
of science, engineering, and business. Although any modern
general-purpose digital computer can be applied to all these fields,
a machine is generally designed to be most suited to some particular
area. Whirlwind I was designed for use in control and simulation
work such as air traffic control, industrial process control, and
aircraft simulation. This does not mean that Whirlwind will not
be used on applications other than control. About one-half the
available computing time for the next year will be assigned to
engineering and scientific calculation including research in such
uses supported by the O.N.R. through the M.I.T. Committee on
Machine Methods for Computation.

These  control  and  simulation  problems  result  in  a  specialized 
emphasis  on  computer  design. 

Short  register  length 

WWI  has  16  binary  digits  and  the  control  problems  are  usually 
very  simple  mathematically.  Furthermore,  the  computer  is  almost 
always  part  of  a  feedback  rather  than  an  open-ended  system. 
Consequently,  roundoff  errors  are  seldom  troublesome  and  the 
register  length  can  be  shortened  to  something  comparable  to  the 
sensitivity  of  the  physical  quantities  involved,  perhaps  five  decimal 
places  or  less. 

¹AIEE-IRE Conf., 70-74 (1951)

WWI has a register length of 16 binary digits including sign
or about four and one-half decimals. The register length was
chosen as the minimum that would provide a usable single-address
order, in this case five binary digits for instruction and 11 binary
digits for address. In a future machine we would probably increase
this register length to 20 or 24 binary digits to get additional order
flexibility; the increased numerical precision is less important.
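
The arithmetic behind this word format can be worked through in a few lines. The sketch assumes, for illustration only, that the five instruction digits occupy the high-order end of the word and the 11 address digits the low-order end; `decode` is a hypothetical helper.

```python
import math

WORD_BITS = 16
OP_BITS = 5
ADDR_BITS = 11


def decode(word):
    """Split a 16-bit order into (operation, address); field placement is assumed."""
    operation = (word >> ADDR_BITS) & (2 ** OP_BITS - 1)
    address = word & (2 ** ADDR_BITS - 1)
    return operation, address


# 5 instruction digits give 32 possible operations.
possible_operations = 2 ** OP_BITS

# 11 address digits reach 2,048 registers, the designed storage size.
addressable_registers = 2 ** ADDR_BITS

# 15 magnitude digits (16 less the sign) correspond to about 4.5 decimals.
decimal_digits = (WORD_BITS - 1) * math.log10(2)
```

The same calculation shows why a longer word was attractive mainly for order flexibility: each added binary digit doubles either the operation count or the address range, but adds only about 0.3 of a decimal digit of precision.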

For  scientific  and  engineering  calculation,  greater  than  16-digit 
precision  is  often  required.  There  is  available  a  set  of  multiple- 
length  and  floating  point  subroutines  which  make  the  use  of 
greater  precision  very  easy.  It  is  true  that  these  subroutines  are 
slow,  bringing  effective  machine  speed  down  to  about  that  ob- 
tained by  acoustic  memory  machines.  It  is  much  more  efficient 
occasionally  to  waste  computing  time  this  way  than  continuously 
to  waste  a  large  part  of  the  storage  and  computing  equipment  of 
the  machine  by  providing  an  unnecessarily  long  register. 

High  operating  speed 

WWI performs 20,000 single-address operations per second. Control
and simulation problems require very high speeds. The necessary
calculations must be carried out in real time; the more complex
the controlled system is, the faster the computer must be.
There is no practical upper limit to the computing speed that
could be used if available.

Where  the  problems  are  large  enough,  and  these  problems  are, 
one  high-speed  machine  is  much  better  than  two  simpler  machines 
of  half  the  speed.  Communication  between  machines  presents 
many  of  the  same  problems  that  communication  between  human 
beings  presents. 

Great effort was put into WWI to obtain high speed. The target
speed was 50,000 single-address operations per second, and all
parts of the machine except storage meet this requirement. The
actual WWI present operating speed of 20,000 single-address
operations per second is on the lower edge of the desired speed
range.

Large  internal  storage 

WWI  now  has  1,280  registers.  A  large  amount  of  high-speed  in- 
ternal storage  is  needed  since  it  is  not  in  general  possible  to  use 
slow  auxiliary  storage  because  of  the  time  factor.  In  many  cases 
a magnetic drum can be useful since its access time is short compared
to the response times of real systems. Even with a drum
there  is  considerable  loss  of  computing  and  programming  efficiency 
due  to  shuffling  information  back  and  forth  between  drum  and 
computer. 

WWI  is  designed  for  2,048  registers  of  storage.  Until  recently 
there  has  been  available  only  about  300  registers.  This  number, 
while  small,  has  been  adequate  for  much  useful  work.  Very  re- 
cently a  second  bank  of  new-model  storage  tubes  has  been  added. 
These  new  tubes  operate  at  1,024  spots  per  tube  bringing  the  total 
WWI  storage  to  1,280  registers.  These  tubes  have  been  in  the 
computer  and  under  test  for  2  months  and  in  active  use  for  about 
2  weeks.  In  the  next  few  months  the  tubes  in  the  first  bank  will 
be  replaced  by  new-model  storage  tubes  bringing  the  total  storage 
to  2,048.  This  number  is  on  the  lower  end  of  what  the  project 
considers  desirable.  What  the  computer  business  needs,  has 
needed,  and  will  probably  always  need  is  a  bigger,  better,  and 
faster  storage  device. 

Extreme  reliability 

In  a  system  where  much  valuable  property  and  perhaps  many 
human  lives  are  dependent  on  the  proper  operation  of  the  com- 
puting equipment,  failures  must  be  very  rare.  Furthermore,  check- 
ing alone,  however  complete,  is  inadequate.  It  is  not  enough 
merely  to  know  that  the  equipment  has  made  an  error.  It  is  very 
unlikely  that  a  man,  presumably  not  too  well  suited  to  the  work 
during  normal  conditions,  can  handle  the  situation  in  an  emer- 
gency. Multiple  machines  with  majority  rule  seem  to  be  the  best 
answer. Self-correcting machines are a possibility but appear to
be too complicated to compete, especially as they provide no
standby protection.

The characteristics of the Whirlwind I computer may be
recapitulated as follows:

Register length           16 binary digits, parallel
Speed                     20,000 single-address operations per second
Storage capacity          Originally 256 registers
                          Recently 320 registers
                          Presently 1,280 registers
                          Target 2,048 registers
Order type                Single-address, one order per word
Numbers                   Fixed point, 9's complement
Basic pulse repetition    1 megacycle
frequency                 2 megacycles (arithmetic element only)
Tube count                5,000, mostly single pentodes
Crystal count             11,000

There  are  32  possible  operations,  of  which  about  27  are  as- 
signed. They  are  of  the  usual  types:  addition,  subtraction,  multi- 
plication, division,  shifting  by  an  arbitrary  number  of  columns, 
transfer  of  all  or  parts  of  words,  subprogram,  and  conditional 
subprogram.  There  are  terminal  equipment  control  orders  and 
there  are  some  special  orders  for  facilitating  double-length  and 
floating-point  operations. 

One  way  to  increase  the  effective  speed  of  a  machine  is  to 
provide  built-in  facilities  for  operations  that  occur  frequently  in 
the  problems  of  interest.  An  example  is  an  automatic  co-ordinate 
transformation order. The addition of such facilities does not affect
the  general-purpose  nature  of  the  machine.  The  machine  retains 
its  old  flexibility  but  becomes  faster  and  more  suited  to  a  certain 
class  of  problems. 

From  March  14,  1951,  at  which  time  we  began  to  keep  detailed 
records,  until  November  22,  1951  a  total  of  950  hours  of  computer 
time  were  scheduled  for  applications  use.  The  machine  has  been 
running on two shifts or a total of about 3,000 hours during this
interval.  The  two-thirds  time  not  used  for  applications  has  been 
used  for  machine  improvement,  adding  equipment,  and  preventive 
maintenance. 

Of  the  950  hours  available,  500  have  been  used  by  the  scientific 
and  engineering  calculation  group,  the  rest  for  control  studies.  The 
limited  storage  available  until  recently  has  been  admittedly  a 
serious  handicap  to  the  scientific  and  engineering  applications 
people.  There  has  not  been  room  in  storage  for  the  lengthy  sub- 
routines necessary  for  convenient  use  of  the  machine.  The  largest 
part  of  their  time  has  been  spent  in  training,  in  setting  up  pro- 
cedures, and  in  preparing  a  library  of  subroutines. 

A  partial  list  of  the  actual  problems  carried  out  by  the  group 
includes: 

1 An industrial production problem for the Harvard Economics
School


2  Magnetic  flux  density  study  for  our  magnetic  storage  work 

3  Oil  reservoir  depletion  studies 

4 Ultra-high frequency television channel allocation investigation
for Dumont

5  Optical  constants  of  thin  metal  films 

6  Computation  of  autocorrelation  coefficients 

7  Tape  generation  for  a  digitally-controlled  milling  machine 


Chapter  6  |  The  Whirlwind  I  computer  139 


The  scientific  and  engineering  applications  time  on  Whirlwind 
I  has  been  organized  in  a  manner  patterned  after  that  originated 
by  Dr.  Wilkes  at  EDSAC.  The  group  of  programmers  and  mathe- 
maticians assigned  to  WWI  assist  users  in  setting  up  their  own 
problems. Small problems requiring only a few seconds or minutes
of  computer  time  are  encouraged.  Applications  time  is  assigned 
in  1-hour  pieces  two  or  three  times  a  day.  No  program  debugging 
is  allowed  on  the  machine.  Program  errors  are  deduced  by  the 
programmer  from  printed  lists  of  results,  storage  contents,  or  order 
sequences  as  previously  requested  from  the  machine  operator.  The 
programmer then corrects his program, which is rerun for him
within  a  day  or  perhaps  within  a  few  hours. 

Every effort is made to reduce the time-consuming job of printing
tabulated results. In many cases a user desires large amounts
of tabulated data only because he doesn't really know what answers
he wants and so asks for everything. Such users are encouraged
to ask only for pertinent results in the form of numbers or
curves plotted by the machine on a cathode-ray tube and
automatically photographed. If these results prove inadequate or the
user gets a better idea of his needs, he is allowed to rerun his
program, again asking only for what appear to be significant
results. Figure 1 shows a sample curve plotted by the computing
machine showing calibrated axes and decimal intercepts.


Fig. 1. Sample computer output.


[Block diagram: arithmetic element, control, and storage, connected by the digit transfer bus.]

Fig. 2. Simplified computer block diagram.


WWI system layout

Figure  2  shows  the  major  parts  of  any  computer  such  as  WWI. 
The  major  elements  of  the  computer  communicate  with  each  other 
via  a  central  bus  system. 

WWI is basically a simple, straightforward, standard machine
of the all-parallel type. Unfortunately, the simple concept often
becomes complicated in execution, and this is true here. WW's
control has been complicated by the decision to keep it completely
flexible, the arithmetic element by the need for high speed, the
storage by the use of electrostatic storage tubes, the terminal
equipment by the diversity of input and output media needed.

Control 

The WW control is divided into several parts, as shown in Fig. 3.
Central  control 

The  central  control  of  the  machine  is  the  master  source  of  control 
pulses. When necessary the central control allows one of the other
controls to function. In general there is no overlapping of control
operation; except for terminal equipment control, only one of the
controls  is  in  operation  at  any  one  time. 

Storage  control 

Storage  control  generates  the  sequence  of  pulses  and  gates  that 
operate  the  storage  tubes.  Central  control  instructs  the  storage 
control  either  to  read  or  to  write. 

Arithmetic  control 

Arithmetic  control  carries  out  the  details  of  the  more  complex 
arithmetic  operations  such  as  multiplication  and  division.  The 


Part  2     The  instruction-set  processor:  main-line  computers 


Section  1     Processors  with  one  address  per  instruction 


Fig. 3. Control. (Block labels: central control, program counter, master clock, operation control, storage control, arithmetic control, terminal equipment control, 120 control lines.)


setup  of  these  operations  plus  the  complete  controlling  of  the 
simpler  operations  such  as  addition  are  carried  out  by  central 
control. 

Terminal  equipment  control 

Terminal  equipment  control  generates  the  necessary  control 
pulses,  delay  times,  and  interlocks  for  the  various  terminal  equip- 
ment units. 

Program  counter 

The  program  counter  which  keeps  track  of  the  address  of  the  next 
order  to  be  carried  out  is  considered  as  part  of  control.  This  is 
an 11-binary-digit counter with provision for reading to the bus.

Most  of  the  functions  of  these  subsidiary  controls  could  be 
combined  with  the  central  control.  The  major  reason  they  are  not 
is  that  they  were  designed  at  different  times.  The  arithmetic  ele- 
ment and  its  control  came  first,  followed  by  central  control.  At 
the  time  central  control  was  designed,  the  necessary  characteristics 
of  storage  control  were  unknown.  In  fact,  the  machine  was  de- 
signed so  that  any  parallel  high-speed  storage  could  be  used.  The 
form  of  terminal  equipment  control  was  also  unknown  at  this  time. 
Since  flexibility  was  a  prime  specification,  it  was  felt  preferable 
to  build  separate  flexible  controls  for  the  various  parts  of  the 
computer  than  to  try  to  combine  all  the  needed  flexibility  in  one 
central  control. 

In  a  new  machine  we  would  attempt  to  combine  control  func- 
tions where  possible,  hoping  to  have  enough  prior  knowledge 


about  component  needs  to  eliminate  subsidiary  controls  com- 
pletely. We  would  still  insist  on  a  large  degree  of  control  flexibility. 

Master  clock 

The  master  clock  consists  of  an  oscillator,  pulse  shaper  and  divider 
that generate 1- and 2-megacycle clock pulses, and a clock pulse
control  that  distributes  these  clock  pulses  to  the  various  controls 
in  the  machine.  It  is  this  unit  that  determines  which  of  the  sub- 
sidiary controls  actually  is  controlling  the  machine.  This  unit  also 
stops  and  starts  the  machine  and  provides  for  push-button  opera- 
tion. 

Operation  control 

The  operation  control,  see  Fig.  4,  was  designed  for  maximum 
flexibility  and  minimum  number  of  operation  digits,  and,  conse- 
quently, minimum  register  length.  It  is  of  the  completely  decoding 
type. 

The operation switch is a 32-position crystal matrix switch that
receives the 5-bit instruction from the bus and in turn selects one
of 32 output lines corresponding to the 32 built-in operations.

There  are  120  gate  tubes  on  the  output  of  the  operation  control. 
Pulses  on  the  120  output  lines  go  to  the  gate  drivers,  pulse  drivers, 
and  control  flip-flops  all  over  the  machine;  120  is  a  generous 
number.  The  suppressors  of  these  gate  tubes  are  connected  to 
vertical  wires  that  cross  the  32  output  lines  from  the  operation 
switch.  Crystals  are  inserted  at  the  desired  junctions  to  turn  on 
those  gate  tubes  that  are  to  be  used  for  any  operation. 


Fig. 4. Operation control. (Block labels: 32-position switch, 8-position time-pulse distributor.)


Chapter  6  |  The  Whirlwind  I  computer  141 


The  time  pulse  distributor  consists  of  an  8-position  switch 
driven  from  a  three  binary-digit  counter.  Clock  pulses  at  the  input 
are  distributed  in  sequence  on  the  eight  output  lines.  The  control 
grids  of  the  output  gate  tubes  are  connected  to  these  timing  lines. 
The  output  of  the  operation  control  is  thus  120  control  lines  on 
each  of  which  can  appear  a  sequence  of  pulses  for  any  combination 
of  orders  at  any  combination  of  times. 
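
The interplay of the operation switch, the crystal matrix, and the time pulse distributor can be sketched in a few lines of Python. This is only an illustrative model, not the Whirlwind wiring: the class and method names, and the idea of recording a set of timing lines per gate tube, are assumptions made for the example. A control line is pulsed when a crystal enables its gate tube for the selected operation and the current time pulse is one of those wired to that tube.

```python
N_OPERATIONS = 32       # output lines of the operation switch (5-bit code)
N_TIME_PULSES = 8       # positions of the time pulse distributor
N_CONTROL_LINES = 120   # gate tubes on the output of the operation control


def decode_operation(op_bits: int) -> int:
    """Select one of the 32 operation lines from a 5-bit operation code."""
    if not 0 <= op_bits < N_OPERATIONS:
        raise ValueError("operation code must fit in 5 bits")
    return op_bits


class OperationControl:
    """Hypothetical model: crystals choose gate tubes, timing lines choose times."""

    def __init__(self):
        # enabled[op] = gate tubes turned on by crystals for that operation
        self.enabled = {op: set() for op in range(N_OPERATIONS)}
        # gate_times[line] = time pulses wired to that gate tube
        self.gate_times = {line: set() for line in range(N_CONTROL_LINES)}
        self.counter = 0  # stands in for the three-binary-digit counter

    def insert_crystal(self, operation: int, line: int) -> None:
        self.enabled[operation].add(line)

    def connect_timing(self, line: int, time_pulse: int) -> None:
        self.gate_times[line].add(time_pulse)

    def clock(self, op_bits: int) -> list:
        """Apply one clock pulse and return the control lines that fire."""
        op = decode_operation(op_bits)
        t = self.counter
        self.counter = (self.counter + 1) % N_TIME_PULSES
        return sorted(l for l in self.enabled[op] if t in self.gate_times[l])
```

Calling `clock` eight times with the same operation code steps through one full cycle of the distributor and yields the pulse sequence for that order.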

Central  control 

The  Central  Control  of  the  machine  is  shown  in  Fig.  5.  The  control 
switch  is  in  the  foreground  with  the  operation  matrix  to  the  right. 

Electrostatic  storage 

The  electrostatic  storage  shown  in  Fig.  6  consists  of  two  banks 
of 16 storage tubes each. There is a pair of 32-position decoders


Fig.  5.  View  of  central  control. 


Fig.  6.  View  of  electrostatic  storage. 


set up by address digits read in from the bus. There is a storage
control that generates the sequence of pulses needed to operate
the gate generators, et cetera. A radio frequency pulser generates
a high power 10-megacycle pulse for readout.
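
Two banks addressed through a pair of 32-position decoders suggest 2 x 32 x 32 = 2048 storage positions, which is what an 11-binary-digit address can reach. The sketch below shows one way such an address might be split. The bit assignment (top digit for the bank, five digits for each decoder) and the function name are assumptions for illustration; the text does not give the actual wiring.

```python
def decode_address(addr: int) -> tuple:
    """Split an 11-bit address into (bank, x, y) under the assumed layout."""
    if not 0 <= addr < 2 * 32 * 32:
        raise ValueError("address must fit in 11 binary digits")
    bank = addr >> 10          # assumed: top digit selects one of two banks
    x = (addr >> 5) & 0o37     # assumed: next five digits set one decoder
    y = addr & 0o37            # assumed: low five digits set the other
    return bank, x, y
```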

Each  digit  column  contains,  besides  the  storage  tubes,  write 
plus  and  write  minus  gate  generators  and  a  signal  plate  gate 
generator for each tube. Ten-megacycle grid pulses are used for
readout  in  order  to  get  the  required  discrimination  between  the 
fractional  volt  readout  pulses  and  the  100-volt  signal  plate  gates. 
For  each  storage  tube  there  is  a  10-megacycle  amplifier,  phase- 
sensitive  detector  and  gate  tube,  feeding  into  the  program  register. 
The  program  register  is  used  for  communicating  with  the  storage 
tubes.  Information  read  out  of  the  tubes  appears  in  the  program 
register.  Information  to  be  written  into  the  tubes  must  be  placed 
in  the  program  register. 




Fig. 7. Arithmetic element.

Arithmetic  element 

The  arithmetic  element,  see  Fig.  7,  consists  of  three  registers,  a 
counter,  and  a  control. 

The  first  register  is  an  accumulator  (AC)  which  actually  consists 
of  a  partial-sum  or  adding  register  and  a  carry  register.  The  accu- 
mulator holds  the  product  during  multiplication. 

The second or A-register holds the multiplicand during multiplication.
All numbers entering the arithmetic element do so
through AR.

The third or B-register holds the multiplier during multiplication.
The accumulator and B-register shift right or left. A high-speed
carry is provided for addition. Subtraction is by 1's complement
and end-around-carry. Multiplication is by successive additions,
division by successive subtractions, and shift orders provide for
shifting right or left by an arbitrary number of steps, with or
without roundoff.
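
Complement subtraction with end-around carry, as used in a binary machine, can be illustrated as follows. This is a minimal sketch in ones' complement arithmetic; the 16-digit word length and all function names are assumptions for the example rather than details taken from the text.

```python
WORD_BITS = 16                 # assumed word length for this sketch
MASK = (1 << WORD_BITS) - 1


def ones_complement(x: int) -> int:
    """Invert every digit of the word."""
    return ~x & MASK


def add_end_around(a: int, b: int) -> int:
    """Add two words; a carry out of the top digit is added back at the bottom."""
    total = a + b
    if total > MASK:
        total = (total & MASK) + 1
    return total & MASK


def subtract(a: int, b: int) -> int:
    """Subtract by adding the complement of the subtrahend."""
    return add_end_around(a, ones_complement(b))


def encode(n: int) -> int:
    """Signed integer to ones' complement word."""
    return n if n >= 0 else ones_complement(-n)


def decode(w: int) -> int:
    """Ones' complement word to signed integer (negative zero reads as 0)."""
    return w if w < (1 << (WORD_BITS - 1)) else -ones_complement(w)
```

For example, `decode(subtract(encode(5), encode(3)))` gives 2 and `decode(subtract(encode(3), encode(5)))` gives -2; the first case exercises the end-around carry and the second does not.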

The  arithmetic  element  is  straightforward  except  for  a  few 
special  orders  and  the  high  speed  at  which  it  operates.  Addition 
takes 3 microseconds complete with carry; multiplication, 16
microseconds  average  including  sign  correction. 

In  Fig.  8  are  shown  several  digits  of  the  arithmetic  element. 
The  large  panels  are  accumulator  digits.  Above  the  accumulator 
is  the  B-register,  below  it  the  A-register. 

Test  control 

Test  control,  shown  in  Fig.  9,  is  used  at  present  both  for  operating 
and  for  trouble  shooting  the  computer.  The  control  includes: 


1  Power  supply  control  and  meters. 

2  Neon  indicators  for  all  flip-flops  in  the  machine. 

3  Switches  for  setting  up  special  conditions. 

4  Manual  intervention  switches. 

5  Oscilloscopes  for  viewing  wave  forms.  A  probe  and  amplifier 
system  allows  viewing  any  wave  form  in  the  computer  on 
one  scope  at  test  control. 

6  Test  equipment  to  provide  synchronizing,  stop,  or  delay 
pulses  at  any  step  of  any  order  of  a  program,  allowing 
viewing  wave  forms  on  the  fly  anywhere  in  the  machine. 

An  important  part  of  the  test  facilities  is  the  test  storage,  a 
group  of  32  toggle-switch  registers  plus  five  flip-flop  registers  that 
can  be  inserted  in  place  of  any  five  of  the  toggle-switch  registers. 
This  storage  has  proved  invaluable  not  only  for  testing  control  and 


Fig.  8.  View  of  arithmetic  element. 




Fig. 9. View of test control.

arithmetic element before electrostatic storage was available but
also  for  testing  electrostatic  storage  itself.  When  not  in  use  for 
test  purposes  test  storage  earns  its  keep  as  part  of  the  terminal 
equipment  system.  The  toggle-switches  hold  a  standard  read-in 
program;  the  flip-flop  registers  are  used  as  in-out  registers  for 
special  purposes. 

Checking 

Logical  checking  facilities  built  into  WWI  are  rather  inconsistent. 
A  complete  bus  transfer  checking  system  has  been  provided,  dupli- 
cate checking  of  some  terminal  equipment  is  permitted,  but  little 
else  is  thoroughly  checked.  We  felt  that  it  was  worthwhile  to 
thoroughly  check  some  substantial  portion  of  the  machine.  This 
portion  would  then  serve  as  a  prototype  for  studying  the  tube 
circuitry  used  throughout  the  machine.  We  did  not  feel  it  was 
worthwhile  to  check  all  the  machine,  a  procedure  that  requires 
a  great  deal  of  added  equipment  and  logical  complexity  plus  a 
substantial  loss  in  computing  speed. 

Operating  experience  has  shown  us  that  it  is  not  worthwhile 
to  provide  detailed  logical  checking  of  a  machine.  In  a  new 
machine  we  would  leave  out  the  transfer  checking.  The  amount  of 
information  and  security  given  by  the  detailed  checking  system  is 
not  enough  to  warrant  the  expense  of  building  and  maintaining  it. 

This  decision  is  based  on  the  expectation  that  a  computing 
machine  should  operate  95  per  cent  of  total  time  or  better  and 
that  the  average  time  between  random  failures  should  be  of  the 
order of 5 to 10 hours or approximately 10^ operations.


In  our  opinion  the  way  to  achieve  the  extremely  high  reliability 
needed  in  some  real-time  control  problems  is  to  provide  three  or 
more  identical  but  distinct  machines,  thus  obtaining  error  correc- 
tion as  well  as  detection,  plus  such  features  as  standby,  safety,  and 
damage  control.  Even  so  the  failure  probability  of  each  machine 
must  be  kept  low  by  proper  design,  marginal  checking,  and  pre- 
ventive maintenance. 

Extremely high reliability means a reliability far beyond that
achieved  in  existing  machines  and  not  conveniently  represented 
as  a  per  cent.  Consider  a  system  consisting  of  three  machines,  each 
operable  98  per  cent  of  the  time  and  each  averaging  10  hours 
between  random  errors. 

One machine will be out of operation 1/2 hour per day.

Two  machines  will  be  out  of  operation      hour  per  month. 

All three machines will be out of operation 4 minutes per year.
Furthermore  undetected  random  errors  might  occur  on  the  aver- 
age of  once  a  year.  Such  reliability  is  needed  in  some  systems. 
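
The downtime figures for one machine and for all three follow from treating the 2 per cent outage of each machine as independent. The sketch below reproduces that arithmetic. The independence assumption, the 30-day month, and the reading of "two machines" as one particular pair are assumptions of this example, not statements from the text.

```python
HOURS_PER_DAY = 24
HOURS_PER_MONTH = 30 * 24      # a 30-day month, assumed
HOURS_PER_YEAR = 365 * 24


def downtime_fraction(availability: float, machines: int) -> float:
    """Fraction of time that `machines` given machines are all down at once,
    assuming their outages are independent."""
    return (1.0 - availability) ** machines


one_down = downtime_fraction(0.98, 1) * HOURS_PER_DAY        # hours per day
pair_down = downtime_fraction(0.98, 2) * HOURS_PER_MONTH     # hours per month
all_down = downtime_fraction(0.98, 3) * HOURS_PER_YEAR * 60  # minutes per year

print(f"{one_down:.2f} h/day, {pair_down:.2f} h/month, {all_down:.1f} min/year")
```

Under these assumptions a single machine is down a little under half an hour per day and all three are down together for roughly four minutes per year.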

Our  decision  to  omit  detailed  checking  does  not  extend  to 
checking  devices  intended  to  detect  programming  errors.  Devices 
to  check  for  overflow  from  the  arithmetic  element  or  for  non- 
existent order  configurations  are  necessary.  Programmers  make 
many  mistakes.  Techniques  for  dealing  with  programming  errors 
are  very  important  and  need  future  development. 

Terminal  equipment 

At the present time, Whirlwind is using the following terminal
equipment: 

1  A  photoelectric  paper  tape  reader 

2  Mechanical  paper  tape  readers  and  punches 

3  Mechanical  typewriters 

4  Oscilloscope  displays  5  to  16  inches  in  diameter  with  phos- 
phors of  various  persistencies  including  a  computer-con- 
trolled scope  camera 

5  Inputs  from  various  analogue  equipments  needed  for  control 
studies 

6  Outputs  to  analogue  equipment 

To be added during the next year:

1  Magnetic  Tape  (units  by  Raytheon).  One  such  unit  is  now 
being  integrated  with  machine. 

2 Magnetic drums (units by Engineering Research Associates,
Inc.).

3 Many more analogue inputs and outputs.




This  great  complexity  of  terminal  equipment  requires  a  flexible 
switching system. There is a single in-out register (IOR) through
which  most  of  the  data  passes. 

There  is  a  switch  which  is  set  up  by  an  order  to  select  the 
desired  piece  of  terminal  equipment.  Other  orders  put  data  into 
IOR or remove data from IOR. The in-out control provides the
necessary  control  pulses  to  go  with  each  type  of  equipment.  In 


general  the  computer  continues  to  run  during  terminal  equipment 
wait  times;  suitable  interlocks  are  provided  to  prevent  trouble. 
This  complete  equipment  has  not  yet  been  fully  installed. 

References 

Whirlwind: EverR51; SerrR62; TaylN51.
EDSAC: SamuA57; WilkM56.




APPENDIX 1 WHIRLWIND I INSTRUCTION CODE¹

[Table: Whirlwind I instruction code.]


Note: In operations mr, mh, dv, slr, srr, srh, sf, the C(BR) is assumed to be
the magnitude of the least significant part of AC + BR. For the ab and dm operations,
the BR is treated just as any storage register.

¹Whirlwind I Instruction Code came from "Comprehensive System Manual, A
System of Automatic Coding for the Whirlwind Computer," published by Massachusetts
Institute of Technology, Digital Computer Laboratory, Cambridge, Mass.


Chapter  7 


Some  aspects  of  the  logical  design  of 
a control computer: a case study¹

R. L. Alonso / H. Blair-Smith / A. L. Hopkins

Summary  Some  logical  aspects  of  a  digital  computer  for  a  space  vehicle 
are  described,  and  the  evolution  of  its  logical  design  is  traced.  The  intended 
application  and  the  characteristics  of  the  computer's  ancestry  form  a  frame- 
work for  the  design,  which  is  filled  in  by  accumulation  of  the  many  decisions 
made  by  its  designers.  This  paper  deals  with  the  choice  of  word  length, 
number  system,  instruction  set,  memory  addressing,  and  problems  of  multi- 
ple precision  arithmetic. 

The  computer  is  a  parallel,  single  address  machine  with  more  than 
10,000  words  of  16  bits.  Such  a  short  word  length  yields  advantages  of 
efficient  storage  and  speed,  but  at  a  cost  of  logical  complexity  in  connection 
with  addressing,  instruction  selection,  and  multiple-precision  arithmetic. 

1.  Introduction 

In this paper we attempt to record the reasoning that led us to
certain  choices  in  the  logical  design  of  the  Apollo  Guidance  Com- 
puter (AGC).  The  AGC  is  an  onboard  computer  for  one  of  the 
forthcoming  manned  space  projects,  a  fact  which  is  relevant  pri- 
marily because  it  puts  a  high  premium  on  economy  and  modularity 
of  equipment,  and  results  in  much  specialized  input  and  output 
circuitry.  The  AGC,  however,  was  designed  in  the  tradition  of 
parallel,  single-address  general-purpose  computers,  and  thus  has 
many properties familiar to computer designers [Richards, 1955],
[Beckman  et  al.,  1961].  We  will  describe  some  of  the  problems 
of  designing  a  short  word  length  computer,  and  the  way  in  which 
the  word  length  influenced  some  of  its  characteristics.  These 
characteristics  are  number  system,  addressing  system,  order  code, 
and  multiple  precision  arithmetic. 

A  secondary  purpose  for  this  paper  is  to  indicate  the  role  of 
evolution  in  the  AGC's  design.  Several  smaller  computers  with 
about  the  same  structure  had  been  designed  previously.  One  of 
these,  MOD  3C,  was  to  have  been  the  Apollo  Guidance  Computer, 
but  a  decision  to  change  the  means  of  electrical  implementation 
(from  core-transistors  to  integrated  circuits)  afforded  the  logical 
designers  an  unusual  second  chance. 

It is our belief, as practitioners of logical design, that designers,
computers and their applications evolve in time; that a frequent
reason for a given choice is that it is the same as, or the logical
next step to, a choice that was made once before.

¹IEEE Trans., EC-12 (6), 687-697 (December, 1963)

A  recent  conference  on  airborne  computers  [Proc.  Conf.  Space- 
borne  Computer  Eng.,  Anaheim,  Calif.,  Oct.  30-31,  1962]  affords 
a  view  of  how  other  designers  treated  two  specific  problems:  word 
length  and  number  system.  All  of  these  computers  have  word 
lengths  of  the  order  of  22  to  28  bits,  and  use  a  two's  complement 
system.  The  AGC  stands  in  contrast  in  these  two  respects,  and 
our  reasons  for  choosing  as  we  did  may  therefore  be  of  interest 
as a minority view.

2.    Description  of  the  AGC 

The AGC has three principal sections. The first is a memory, the
fixed (read only) portion of which has 24,576 words, and the
erasable portion of which has 1024 words. The next section may
be called the central section; it includes, besides an adder and a
parity  computing  register,  an  instruction  decoder  (SQ),  a  memory 
address  decoder  (S),  and  a  number  of  addressable  registers  with 
either  special  features  or  special  use.  The  third  section  is  the 
sequence  generator  which  includes  a  portion  for  generating  various 
microprograms  and  a  portion  for  processing  various  interrupting 
requests. 

The  backbone  of  the  AGC  is  the  set  of  16  write  busses;  these 
are  the  means  for  transferring  information  between  the  various 
registers  shown  in  Fig.  1.  The  arrowheads  to  and  from  the  various 
registers  show  the  possible  directions  of  information  flow. 

In  Fig.  1,  the  data  paths  are  shown  as  solid  lines;  the  control 
paths  are  shown  as  broken  lines. 

Memory:  fixed  and  erasable 

The Fixed Memory is made of wired-in "ropes" [Alonso and
Laning, 1960], which are compact and reliable devices. The number
of bits so wired is about 4 x 10^5. The cycle time is 12 μsec.

The  erasable  memory  is  a  coincident  current  system  with  the 
same  cycle  time  as  the  fixed  memory.  Instructions  can  address 
registers  in  either  memory,  and  can  be  stored  in  either  memory. 




Fig. 1. AGC block diagram. (Block labels: priority circuits; oscillator and timing pulses; sequence generator; memory address register; memory bank register MB; addressable central registers A, Q, Z, LP, IN, OUT; arithmetic unit with adder and parity; instruction decode SQ; memory local register; special gating; fixed memory; erasable memory.)


The only logical difference between the two memories is the
inability to change the contents of the fixed part by program steps.

Each word in memory is 16 bits long (15 data bits and an odd
parity bit). Data words are stored as signed 14-bit words using
a one's complement convention. Instruction words consist of 3
order code bits and 12 address code bits.
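
The word layout just described can be made concrete with a short sketch: a 15-bit data field, an odd parity bit, a one's complement number with a sign and 14 magnitude bits, and an instruction split into a 3-bit order code and a 12-bit address. The function names and the placement of the parity bit above the data bits are assumptions for illustration.

```python
DATA_BITS = 15
DATA_MASK = (1 << DATA_BITS) - 1
MAX_MAGNITUDE = (1 << 14) - 1


def odd_parity_bit(data: int) -> int:
    """Return the bit that makes the total number of ones in the word odd."""
    ones = bin(data & DATA_MASK).count("1")
    return 0 if ones % 2 == 1 else 1


def store_word(data: int) -> int:
    """Form the 16-bit stored word (parity bit placed on top, by assumption)."""
    data &= DATA_MASK
    return (odd_parity_bit(data) << DATA_BITS) | data


def encode_number(n: int) -> int:
    """Signed integer to a 15-bit one's complement data word."""
    if abs(n) > MAX_MAGNITUDE:
        raise ValueError("magnitude needs more than 14 bits")
    return n if n >= 0 else ~(-n) & DATA_MASK


def split_instruction(word: int) -> tuple:
    """Split a 15-bit instruction into (order code, address)."""
    word &= DATA_MASK
    return word >> 12, word & 0o7777
```

With this layout, `split_instruction(0o30123)` yields order code 3 and address octal 0123.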

The contents of the address register S uniquely determine the
address of the memory word only if the address lies between octal
0000 and octal 5777, inclusive. If the address lies between octal
6000 and octal 7777, inclusive, the address in S is modified by the
contents of the memory bank register MB. The modification consists
in adding some integral multiple of octal 2000 to the address
in S before it is interpreted by the decoding circuitry. The memory
bank register MB is itself addressable; its address, however, is not
modified by its own contents.
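
The bank-switching rule can be sketched as a small function. This is a hedged illustration: it assumes that MB holds the multiplier directly, so that a value of zero leaves the address unchanged. The text says only that some integral multiple of octal 2000 is added, and the function name is hypothetical.

```python
BANK_SIZE = 0o2000      # size of the switched window
DIRECT_LIMIT = 0o5777   # addresses up to here are used as they stand
WINDOW_END = 0o7777     # largest address that fits in 12 bits


def effective_address(s: int, mb: int) -> int:
    """Return the address seen by the decoding circuitry."""
    if not 0 <= s <= WINDOW_END:
        raise ValueError("S holds a 12-bit address")
    if mb < 0:
        raise ValueError("bank number must be non-negative")
    if s <= DIRECT_LIMIT:
        return s                    # MB has no effect below octal 6000
    return s + mb * BANK_SIZE       # assumed meaning of the MB contents
```

The 25,600 words of fixed and erasable memory consist of octal 6000 (3072) directly addressed words plus 22 blocks of octal 2000 (1024) words, so under this assumption bank numbers 0 through 21 would be enough to reach every word.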

Transfers  in  and  out  of  memory  are  made  by  way  of  a  memory 
local  register  G.  For  certain  specific  addresses,  the  word  being 
transferred  into  G  is  not  sent  directly,  but  is  modified  by  a  special 
gating  network.  The  transformations  on  the  word  sent  to  G  are 
right  shift,  left  shift,  right  cycle,  and  left  cycle. 

Central  section 

The  middle  part  of  Fig.  1  shows  the  central  section  in  block  form. 
It  consists  of  the  address  register  S  and  the  memory  bank  register 


MB  both  of  which  were  mentioned  above.  There  is  also  a  block 
of  addressable  registers  called  "central  and  special  registers," 
which  will  be  discussed  later,  an  arithmetic  unit,  and  an  instruc- 
tion decoder  register  SQ. 

The  arithmetic  unit  has  a  parity  generating  register  and  an 
adder.  These  two  registers  are  not  explicitly  addressable. 

The SQ register bears the same relation to instructions as the
S register bears to memory locations; neither S nor SQ are explicitly
addressable.

The central and special registers are A, Q, Z, LP, and a set of
input and output registers. Their properties are shown in Table 1.

Sequence  generator 

The sequence generator provides the basic memory timing, the
sequences of control pulses (microprograms) which constitute an
instruction, the priority interrupt circuitry, and a number of scaling
networks which provide various pulse frequencies used by the
computer and the rest of the navigation system.

Instructions  are  arranged  so  as  to  last  an  integral  number  of 
memory  cycles.  The  list  of  11  instructions  is  treated  in  detail  in 
Sec.  6.  In  addition  to  these  there  are  a  number  of  "involuntary" 
sequences, not under normal program control, which may break
into  the  normal  sequence  of  instructions;  these  are  triggered  either 
by  external  events,  or  by  certain  overflows  within  the  AGC,  and 




Table 1  Special and central registers

Register  Octal address  Purpose and/or properties
A         0000           Central accumulator. Most instructions refer to A.
Q         0001           If a transfer of control (TC) occurred at L,
                         c(Q) = L + 1.
Z         0002           Program counter. Contains L + 1, where L is the
                         address of the instruction presently being executed.
LP        0003           Low product register. This register modifies words
                         written into it by shifting them in a special way.
IN                       Several registers which are used for sampling either
                         external lines, or internal computer conditions such
                         as time or alarms.
OUT                      Several output registers whose bits control switches,
                         networks, and displays.

may  be  divided  into  two  categories:  counter  incrementing  and 
program  interruption. 

Counter  incrementing  may  take  place  between  any  two  mem- 
ory cycles.  External  requests  for  incrementing  a  counter  are  stored 
in  a  counter  priority  circuit.  At  the  end  of  every  memory  cycle 
a  test  is  made  to  see  if  any  incrementing  requests  exist.  If  not, 
the  next  normal  memory  cycle  is  executed  directly,  with  no  time 
between  cycles.  If  a  request  is  present,  an  incrementing  memory 
cycle  is  executed.  Each  "counter"  is  a  specific  location  in  erasable 
memory.  The  incrementing  cycle  consists  of  reading  out  the  word 
stored  in  the  counter  register,  incrementing  it  (positively  or  nega- 
tively), or  shifting  it,  and  storing  the  results  back  in  the  register 
of  origin.  All  outstanding  counter  incrementing  requests  are  proc- 
essed before  proceeding  to  the  next  normal  memory  cycle.  This 
type  of  interrupt  provides  for  asynchronous  incremental  or  serial 
entry  of  information  into  the  working  erasable  memory.  The  pro- 
gram steps  may  refer  directly  to  a  "counter"  to  obtain  the  desired 
information  and  do  not  have  to  refer  to  input  buffers.  Overflows 
from  one  counter  may  be  used  as  the  input  to  another.  A  further 
property  of  this  system  is  that  the  time  available  for  normal  pro- 
gram steps  is  reduced  linearly  by  the  amount  of  counter  activity 
present  at  any  given  time. 
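
The counter incrementing scheme can be pictured with a small simulation. This is a simplified sketch under stated assumptions: requests are served in arrival order rather than by a priority circuit, only incrementing (not shifting) is modeled, and the class name and counter addresses are hypothetical.

```python
from collections import deque


class CounterSystem:
    """Toy model of counter incrementing between memory cycles."""

    def __init__(self, counter_addresses):
        # Each "counter" is just a location in erasable memory.
        self.erasable = {addr: 0 for addr in counter_addresses}
        self.pending = deque()      # stands in for the counter priority circuit
        self.cycles_used = 0        # all memory cycles, including increments
        self.program_cycles = 0     # cycles spent on normal program steps

    def request(self, addr: int, delta: int) -> None:
        """An external request to increment (+1) or decrement (-1) a counter."""
        self.pending.append((addr, delta))

    def memory_cycle(self) -> None:
        """Run one normal memory cycle, then serve every outstanding request."""
        self.cycles_used += 1
        self.program_cycles += 1
        while self.pending:
            addr, delta = self.pending.popleft()
            self.erasable[addr] += delta    # read, modify, write back
            self.cycles_used += 1           # each increment costs one cycle
```

The difference between `cycles_used` and `program_cycles` shows how the time left for normal program steps falls linearly with counter activity.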

Program  interruption  occurs  between  normal  program  steps 


rather  than  between  memory  cycles.  An  interruption  consists  of 
storing the contents of the program counter and transferring control
to a fixed location. Each interrupt line has a different location
associated with it. Interrupting programs may not be interrupted,
but interrupt requests are not lost, and are processed as soon as
the  earlier  interrupted  program  is  resumed.  Calling  the  resume 
sequence,  which  restores  the  program  counter,  is  initiated  by 
referencing  a  special  address. 
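
The program interruption rules above can be sketched as a small state machine. The class, its method names, and the fixed locations used in any example are hypothetical; pending requests are served in arrival order here, which is a simplification of whatever priority the hardware applies.

```python
class InterruptController:
    """Toy model: one level of interruption, requests held until resume."""

    def __init__(self, fixed_locations):
        self.fixed_locations = fixed_locations  # interrupt line -> location
        self.pending = []
        self.saved_pc = None
        self.in_interrupt = False

    def request(self, line: int) -> None:
        """Record a request; it is kept even if it cannot be served yet."""
        if line not in self.pending:
            self.pending.append(line)

    def between_steps(self, pc: int) -> int:
        """Called between program steps; returns where execution continues."""
        if self.pending and not self.in_interrupt:
            line = self.pending.pop(0)
            self.saved_pc = pc              # store the program counter
            self.in_interrupt = True        # interrupting programs are not interrupted
            return self.fixed_locations[line]
        return pc

    def resume(self) -> int:
        """The resume sequence: restore the program counter."""
        self.in_interrupt = False
        return self.saved_pc
```

A request that arrives while an interrupting program is running stays in `pending` and is taken at the first `between_steps` call after `resume`.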

3.    Word  length 

In  an  airborne  computer,  granted  the  initial  choice  of  parallel 
transfer  of  words  within  it,  it  is  highly  desirable  to  minimize  the 
word  length.  This  is  because  memory  sense  amplifiers,  being  high- 
gain  class  A  amplifiers,  are  considerably  harder  to  operate  with 
wide  margins  (of  temperature,  voltages,  input  signal)  than,  say, 
the  circuits  made  up  of  NOR  gates.  It  is  best  to  have  as  few  of 
these  as  possible.  Furthermore,  the  number  of  ferrite-plane  inhibit 
drivers equals the number of bits in a word in this case. Similarly,
the  time  required  for  a  carry  to  propagate  in  a  parallel  adder  is 
proportional  to  the  word  length,  and  in  the  present  case,  this  factor 
could  be  expected  to  affect  the  microprogramming  of  instructions. 
The  initial  intent,  then,  was  to  have  as  short  a  word  length  as 
possible. 

Another initial choice is that the AGC should be a "common
storage"  machine,  which  means  that  instructions  may  be  executed 
from  erasable  memory  as  well  as  from  fixed  memory,  and  that  data 
(obviously  constants,  in  the  case  of  fixed  memory)  may  be  stored 
in  either  memory.  This  in  turn  means  that  the  word  sizes  of  both 
types  of  memory  must  be  compatible  in  some  sense;  for  the  AGC, 
the  easiest  form  of  compatibility  is  to  have  equal  word  lengths. 
So-called  "separate  storage"  solutions  which  allow  different  word 
lengths for instructions and data can be made to work [Walendziewicz,
1962] but they have a drawback in that three memories
are  then  required:  a  data  memory  (erasable),  and  two  fixed  memo- 
ries, one  for  instructions  and  one  for  constants.  In  addition,  we 
have  found  that  separate  storage  machines  are  more  awkward  to 
program,  and  use  memory  less  efficiently,  than  common  storage 
machines. 

There  are  three  principal  factors  in  the  choice  of  word  length. 
These  are: 

1  Precision  desired  in  the  representation  of  navigational  vari- 
ables. 

2  Range  of  the  input  variables  which  are  entered  serially  and 
counted. 




3    Instruction  word  format.  Division  of  instruction  words  into 
two  fields,  one  for  operation  code  and  one  for  address. 

As a start, the choice of word length (15 bits) for two previous
machines  in  this  series  was  kept  in  mind  as  a  satisfactory  word 
length  from  the  point  of  view  of  mechanization;  i.e.,  the  number 
of  sense  amplifiers,  inhibit  drivers,  the  carry  propagation  time,  etc., 
were  all  considered  satisfactory.  The  act  of  "choosing"  word  length 
really  meant  whether  or  not  to  alter  the  word  length,  at  the  time 
of change from MOD 3C to the AGC, and in particular whether
to  increase  it.  The  influence  of  the  three  principal  factors  will  be 
taken  up  in  turn. 

Precision  of  data  words 

The data words used in the AGC may be divided roughly into
two classes: data words used in elaborate navigational computations,
and data words used in the control of various appliances
in the system. Initial estimates of the precision required by the
first  class  ranged  from  27  to  32  bits,  0(10*'-').  The  second  class 
of  variables  could  almost  always  be  represented  with  15  bits.  The 
fact  that  navigational  variables  require  about  twice  the  desired 
15-bit  word  length  means  that  there  is  not  much  advantage  to 
word  sizes  between  15  and  28  bits,  as  far  as  precision  of  represen- 
tation of  variables  is  concerned,  because  double-precision  mnnbers 
must  be  used  in  anv  event.  Because  of  the  doublv  signed  number 
representation  for  double-precision  words,  the  equivalent  word 
length  is  29  bits  (including  sign),  rather  than  .30,  for  a  basic  word 
length  of  15  bits. 

The  initial  estimates  for  the  proportion  of  15-bit  vs  29-bit 
quantities to be stored in both fixed and erasable memories indicated
the overwhelming preponderance of the former. It was also
estimated that a significant portion of the computing had to do
with control, telemetry and display activities, all of which can be
handled  more  economically  with  short  words.  A  short  word  length 
allows  faster  and  more  efficient  use  of  erasable  storage  because 
it  reduces  fractional  word  operations,  such  as  packing  and  editing; 
it  also  means  a  more  efficient  encoding  of  small  integers. 

Range  of  input  variables 

As  a  control  computer,  the  AGC  must  make  analog-to-digital 
conversions,  many  of  which  are  of  shaft  angles.  Two  principal 
forms  of  conversion  exist:  one  renders  a  whole  number,  the  other 
produces  a  train  of  pulses  which  must  be  counted  to  yield  the 
desired  number.  The  latter  type  of  conversion  is  employed  by  the 
AGC,  using  the  counter  incrementing  feature. 

When the number of bits of precision required is greater than
the computer's word length, the effective length of the counter
must be extended into a second register, either by programmed
scanning of the counter register, or by using a second counter
register to receive the overflows of the first. Whether programmed
scanning is feasible depends largely on how frequently this scanning
must be done. The cost of using an extra counter register
is directly measured in terms of the priority circuit associated
with it.

In the AGC, the equipment saved by reducing the word length
below 15 bits would probably not match the additional expense
incurred  in  double-precision  extension  of  many  input  variables. 
The  question  is  academic,  however,  since  a  lower  bound  on  the 
word  length  is  effectively  placed  by  the  format  of  the  instruction 
word. 

Instruction  word  format 

An initial decision was made that instructions would consist of
an operation code and a single address. The straightforward
choices of packing one or two such instructions per word were
the only ones seriously considered, although other schemes, such
as packing one and a half instructions per word, are possible
[England, 1962]. The previous computers MOD 3S and MOD 3C
had a 3-bit field for operation codes and a 12-bit field for addresses,
to accommodate their 8 instruction order codes and 4096 words
of memory. In the initial core-transistor version of the AGC (i.e.,
MOD 3C), the 8 instruction order codes were in reality augmented
by the various special registers provided, such as shift right, cycle
left, edit, so that a transfer in and out of one of these registers
would accomplish actions normally specified by the order code
(see Sec. 6). These registers were considered to be more economical
than the corresponding instruction decoding and control pulse
sequence generation. Hence the 3 bits assigned to the order code
were considered adequate, albeit not generous. Furthermore, as
will be seen, it is possible to use an indexing instruction so as to
increase to eleven the number of explicit order codes provided
for.

The address field of 12 bits presented a different problem. At
the time of the design of MOD 3C we estimated that 4000 words
would satisfy the storage requirements. By the time of redesign
it was clear that the requirement was for 10^4 words, or more, and
the question then became whether the proposed extension of the
address field by a bank register (see Sec. 7) was more economical
than the addition of 2 bits to the word length. For reasons of
modularity of equipment, adding 2 more bits to the word length
would result in adding 2 more bits to all the central and special
registers, which amounts to increasing the size of the nonmemory
portion of the AGC by 10 per cent.


150  Part  2  I  The  instruction-set  processor:  main-line  computers  Section  1  j  Processors  with  one  address  per  instruction 


In  summary,  the  15-bit  word  length  seemed  practical  enough 
so  that  the  additional  cost  of  extra  bits  in  terms  of  size,  weight, 
and  reliability  did  not  seem  warranted.  A  14-bit  word  length  was 
thought  impractical  because  of  the  problems  with  certain  input 
variables,  and  it  would  further  restrict  the  already  somewhat 
cramped  instruction  word  format.  Word  lengths  of  17  or  18  bits 
would  result  in  certain  conceptual  simplicities  in  the  decoding 
of  instructions  and  addresses,  but  would  not  help  in  the  represen- 
tation of  navigational  variables.  These  require  28  bits,  and  so  they 
must  be  represented  to  double  precision  in  any  event. 


4.    Number  representation 

Signed  numbers 

In  the  absence  of  the  need  to  represent  numbers  of  both  signs, 
the  discussion  of  number  representation  would  not  extend  beyond 
the  fact  that  numbers  in  AGC  are  expressed  to  base  two.  But  the 
accommodation  of  both  positive  and  negative  numbers  requires 
that  the  logical  designer  choose  among  at  least  three  possible  forms 
of binary arithmetic. These three principal alternatives are: (1)
one's complement, (2) two's complement, and (3) sign and magnitude
[Richards, 1955].

In  one's  complement  arithmetic,  the  sign  of  a  number  is  re- 
versed by  complementing  every  digit,  and  "end  around  carry"  is 
required  in  addition  of  two  numbers. 

In  two's  complement  arithmetic,  sign  reversal  is  effected  by 
complementing  each  bit  and  adding  a  low  order  one,  or  some 
equivalent  operation. 

Sign and magnitude representation is typically used where
direct human interrogation of memory is desired, as in "post-mortem"
memory dumps, for example. The addition of numbers
of  opposite  sign  requires  either  one's  or  two's  complementation 
or  comparison  of  magnitude,  and  sometimes  may  use  both.  No 
advantage  is  offered  in  efficiency  with  the  possible  exception  of 
sign  changing,  which  only  requires  changing  the  sign  bit.  A  disad- 
vantage is  engendered  in  magnetic  core  logic  machines  by  the 
extra  equipment  needed  for  subtraction  or  conditional  recomple- 
mentation. 

The  one's  complement  notation  has  the  advantage  of  having 
easy sign reversal, which is equivalent to Boolean complementation;
hence a single machine instruction performs both functions.
Zero is ambiguously represented by all zeros and by all ones,
so that the number of numerical states in an n-bit word is 2^n − 1.

Two's complement arithmetic is advantageous where end
around carry is difficult to mechanize, as is particularly true in
serial computers. An n-bit word has 2^n states, which is desirable
for input conversions from such devices as pattern generators,
geared encoders, or binary scalers. Sign reversal is awkward, however,
since a full addition is required in the process.

The  choice  in  the  case  of  the  AGC  was  to  use  one's  complement 
arithmetic  in  general  processing,  and  two's  complements  for  cer- 
tain input  angle  conversions.  Since  the  only  arithmetic  done  in 
the  latter  case  is  the  addition  of  plus  or  minus  one,  the  two's 
complement  facility  is  provided  simply  by  suppressing  end  around 
carry  and  using  the  proper  representation  of  minus  one.  The  latter 
is  stored  as  a  fixed  constant,  so  that  no  sign  reversal  is  required. 
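
The two forms of arithmetic just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, words are modeled as plain 15-bit integers, and nothing here is taken from the AGC's actual logic. It shows sign reversal by complementation, addition with end around carry, and the counting trick in which end around carry is suppressed and minus one is held as a fixed constant.

```python
BITS = 15                      # AGC memory word length given in the text
MASK = (1 << BITS) - 1         # 0o77777, all ones


def oc_negate(word):
    """One's complement sign reversal: complement every bit."""
    return ~word & MASK


def oc_add(a, b):
    """One's complement addition; a carry out of the top bit is added back in."""
    total = a + b
    if total > MASK:                   # carry out of the sign position
        total = (total & MASK) + 1     # end around carry
    return total & MASK


def oc_value(word):
    """Signed value of a one's complement word; both zeros map to 0."""
    if word & (1 << (BITS - 1)):
        return -(~word & MASK)
    return word


def counter_step(word, up):
    """Two's complement counting: add +1, or the stored constant for -1,
    with the end around carry suppressed."""
    minus_one = MASK                   # two's complement -1, kept as a constant
    return (word + (1 if up else minus_one)) & MASK
```

Note that `oc_add` never needs a separate subtract path: subtraction is addition of `oc_negate(x)`. The counter, by contrast, passes through all 2^15 states, including the all-ones pattern that one's complement treats as a second zero.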

Modified  one's  complement  system 

In  a  standard  one's  complement  adder,  overflow  is  detected  by 
examining  carries  into  and  out  of  the  sign  position.  These  overflow 
indications must be "caught on the fly" and stored separately if
they  are  to  be  acted  upon  later.  The  number  system  adopted  in 
the  AGC  has  the  advantage  of  being  a  one's  complement  system 
with  the  additional  feature  of  having  a  static  indication  of  over- 
flow. The  implementation  of  the  method  depends  on  the  AGC's 
not  using  a  parity  bit  in  most  central  registers.  Because  of  certain 
modular  advantages,  16,  rather  than  15,  columns  are  available  in 
all  of  the  central  registers,  including  the  adder.  Where  the  parity 
bit  is  not  required,  the  extra  bit  position  is  used  as  an  extra  column. 
The  virtue  of  the  16-bit  adder  is  that  the  overflow  of  a  15-bit  sum 
is  readily  detectable  upon  examination  of  the  two  high  order  bits 
of  the  sum  (see  Fig.  2).  If  both  of  these  bits  are  the  same,  there 
is  no  overflow.  If  they  are  different,  overflow  has  occurred  with 
the  sign  of  the  highest  order  bit. 

The  interface  between  the  16-bit  adder  and  the  15-bit  memory 
is  arranged  so  that  the  sign  bit  of  a  word  coming  from  memory 
enters both of the two high order adder columns. These are denoted
S2 and S1 since they both have the significance of sign bits.
When a word is transferred from the accumulator A to memory,
only one of these two signs can be stored. Our choice was to store
the S2 bit, which is the standard one's complement sign except
in the event of overflow, in which case it is the sign of the two
operands. This preservation of sign on overflow is an important
asset in dealing with carries between component words of
multiple-precision numbers (see Sec. 5).
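
A minimal Python sketch of this interface may help. It assumes 15-bit memory words, a 16-column adder with S2 as the highest column and S1 the next, and hypothetical function names; it is a model of the behavior described above, not of the AGC circuitry.

```python
ADDER_MASK = 0o177777          # 16 adder columns: S2, S1, and 14 magnitude bits


def load(word15):
    """Memory -> adder: the stored sign bit enters both S2 and S1."""
    sign = (word15 >> 14) & 1
    return (sign << 15) | word15


def add16(a, b):
    """16-bit one's complement addition with end around carry."""
    total = a + b
    if total > ADDER_MASK:
        total = (total & ADDER_MASK) + 1
    return total


def overflow(acc):
    """Static overflow indication: 0 if S2 == S1, else +1 or -1 by the sign of S2."""
    s2, s1 = (acc >> 15) & 1, (acc >> 14) & 1
    if s2 == s1:
        return 0
    return 1 if s2 == 0 else -1


def store(acc):
    """Adder -> memory: keep S2 as the sign and drop S1."""
    s2 = (acc >> 15) & 1
    return (s2 << 14) | (acc & 0o37777)
```

For instance, `add16(load(0o37777), load(0o00001))` yields `0o040000`, for which `overflow` returns +1 and `store` returns a positive word, so the sign of the operands survives the overflow.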

In  a  standard  one's  complement  system,  a  series  of  additions 
may  result  in  subtotals  which  overflow,  yet  still  produce  a  valid 
sum  so  long  as  the  total  does  not  exceed  the  capacity  of  one  word. 
In  a  modified  one's  complement  system,  however,  where  sign  is 
preserved  on  overflow,  this  is  no  longer  true;  and  the  total  may 
depend  on  the  order  in  which  the  numbers  are  added;  this  is  not 
a  serious  drawback,  but  it  must  be  accounted  for  in  all  phases 
of  logical  design  and  programming. 



      STANDARD                  MODIFIED
  S1   4   3   2   1      S2  S1   4   3   2   1

EXAMPLE 1: Both operands positive; sum positive, no overflow. Identical
results in both systems.

   0   0   0   0   1       0   0   0   0   0   1
   0   0   0   1   1       0   0   0   0   1   1
   0   0   1   0   0       0   0   0   1   0   0    (sum)

EXAMPLE 2: Both operands positive; positive overflow. Standard result is
negative; modified result is positive using S2 as sign of the answer.
Positive overflow indicated by S1 = 1, S2 = 0.

   0   1   0   0   1       0   0   1   0   0   1
   0   1   0   1   1       0   0   1   0   1   1
   1   0   1   0   0       0   1   0   1   0   0    (sum)

EXAMPLE 3: Both operands negative; sum negative, no overflow. End around
carry occurs. Identical results in both systems using either S1 or S2
as the sign of the answer.

   1   1   1   1   0       1   1   1   1   1   0
   1   1   1   0   0       1   1   1   1   0   0
   1   1   0   1   0       1   1   1   0   1   0    (before carry)
                   1                           1    (end around carry)
   1   1   0   1   1       1   1   1   0   1   1    (sum)

EXAMPLE 4: Both operands negative; negative overflow. Standard result is
positive; modified result is negative using S2 as the sign of the answer.
Negative overflow indicated by S1 = 0, S2 = 1.

   1   0   1   1   0       1   1   0   1   1   0
   1   0   1   0   0       1   1   0   1   0   0
   0   1   0   1   0       1   0   1   0   1   0    (before carry)
                   1                           1    (end around carry)
   0   1   0   1   1       1   0   1   0   1   1    (sum)

EXAMPLE 5: Operands have opposite sign; sum positive. Identical results in
both systems.

   1   1   1   1   0       1   1   1   1   1   0
   0   0   0   1   1       0   0   0   0   1   1
   0   0   0   0   1       0   0   0   0   0   1    (before carry)
                   1                           1    (end around carry)
   0   0   0   1   0       0   0   0   0   1   0    (sum)

EXAMPLE 6: Operands have opposite sign; sum negative. Identical results in
both systems.

   1   1   1   0   0       1   1   1   1   0   0
   0   0   0   0   1       0   0   0   0   0   1
   1   1   1   0   1       1   1   1   1   0   1    (sum)

Fig. 2. Illustrative example of properties of modified one's complement system.


5.    Multiple  precision  arithmetic 

A short word computer can be effective only if the multiple-precision
routines are efficient corresponding to their share of the
computer's work load. In the AGC's application there is enough
use for multiple-precision arithmetic to warrant consideration in
the choice of number system and in the organization of the instruction
set. Although the limited number of order codes prohibits
multiple-precision instructions, special features are associated with
the conventional instructions to expedite multiple-precision operations.

Independent  sign  representation 

A variety of formats for multiple-precision representation are
possible; probably the most common of these is the identical sign
representation in which the sign bits of all component words agree.
The method used in the AGC allows the signs of the components
to be different.

Independent signs arise naturally in multiple-precision addition
and subtraction, and the identical sign representation is costly
because sign reconciliation is required after every operation. For
example, (+6, +4) + (−4, −6) = (+2, −2), a mixed sign representation
of (+1, +8). Since addition and subtraction are the most
frequent operations, it is economical to store the result as it occurs
and reconcile signs only when necessary. When overflow occurs
in the addition of two components, a one with the sign of the
overflow is carried to the addition of the next higher components.
The sum that overflowed retains the sign of its operands. This
overflow is termed an interflow to distinguish it from an overflow
that arises when the maximum multiple-precision number is exceeded.

The  independent  sign  method  has  a  pitfall  arising  from  the  fact 
that every number has two representations, either one of which
may  occur  as  a  sum.  There  are  some  numbers  for  which  one  of 
the  representations  exceeds  the  capacity  of  the  most  significant 
component.  The  overflow  is  false  in  the  sense  that  the  double- 
precision  capacity  is  not  exceeded,  only  the  single  word  capacity 
of  the  upper  component.  Sign  reconciliation  can  be  used  in  this 
case  to  yield  an  acceptable  representation.  This  problem  can  be 
avoided  if  all  numbers  are  scaled  so  that  none  are  large  enough 
to  produce  false  overflows.  Such  a  restriction  is  not  necessary, 
however,  since  the  false  overflow  condition  arises  infrequently  and 
can  be  detected  at  no  expense  in  time.  The  net  cost  of  reconcilia- 
tion is  therefore  very  low. 
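
The independent sign method can be illustrated with a short Python sketch. It is a simplification under stated assumptions: components are modeled as signed Python integers rather than one's complement bit patterns, the function names are hypothetical, and the `base` parameter is 2^14 for 15-bit words but may be set to 10 to reproduce the decimal example above.

```python
BASE = 1 << 14   # a 15-bit component carries 14 magnitude bits


def dp_add(x, y, base=BASE):
    """Add two (high, low) pairs component by component.

    If the low-order sum overflows, an interflow of +1 or -1 is carried
    upward and the low-order sum keeps the sign of its operands.
    """
    low = x[1] + y[1]
    interflow = 0
    if low >= base:
        low -= base
        interflow = 1
    elif low <= -base:
        low += base
        interflow = -1
    return x[0] + y[0] + interflow, low


def reconcile(high, low, base=BASE):
    """Make both components agree in sign without changing the value."""
    if high > 0 and low < 0:
        return high - 1, low + base
    if high < 0 and low > 0:
        return high + 1, low - base
    return high, low


def dp_value(high, low, base=BASE):
    """Numerical value of a (high, low) pair."""
    return high * base + low
```

With `base=10`, `dp_add((6, 4), (-4, -6), base=10)` gives `(2, -2)` and `reconcile(2, -2, base=10)` gives `(1, 8)`. A false overflow looks like `(9, 5) + (1, -7) = (10, -2)` in the same decimal model: the upper component exceeds one digit, yet reconciliation yields the acceptable form `(9, 8)`.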

Multiplication  and  division 

For  triple  and  higher  orders  of  precision,  multiplication  and  divi- 
sion become  excessively  complex,  unlike  addition  and  subtraction 
where  the  complexity  is  only  linear  with  the  order  of  precision. 

The  algorithm  for  double-precision  multiplication  is  directly 
applicable  to  numbers  in  the  independent  sign  notation.  False 
overflow  does  not  arise,  and  the  treatment  of  interflow  is  simplified 
by  an  automatic  counter  register  which  is  incremented  when 
overflow  occurs  during  an  add  instruction.  The  sign  of  the  counter 
increment  is  the  same  as  the  sign  of  the  overflow;  and  the  incre- 
ment takes  place  while  one  of  the  product  components  of  next 
higher  order  is  stored  in  that  counter. 

Double-precision  division  is  exceptional  in  that  the  independ- 
ent sign  notation  may  not  be  used;  both  operands  must  be  made 
positive  in  identical  sign  form,  and  the  divisor  normalized  so  that 
the  left-most  nonsign  bit  is  one. 

Triple  precision 

A  few  triple-precision  quantities  are  used  in  the  AGC.  These  are 
added  and  subtracted  using  independent  sign  notation  with  inter- 
flow and  overflow  features  the  same  as  those  used  for  double- 
precision  arithmetic. 

6.    Instruction  set 

Basic  design  criteria 

The  implicit  requirements  for  any  von  Neumann-type  machine 
demand  that  facilities  exist  for: 

1    Fetching  from  memory 


2  Storing  in  memory 

3  Negating  (complementing) 

4  Combining  two  operands  (e.g.,  addition) 

5  Address  modification  (more  generally,  executing  as  an  in- 
struction the  result  of  arithmetic  processing) 

6  Normal  sequencing  (to  each  location  from  which  an  instruc- 
tion can  be  executed  there  corresponds  one  location  whose 
contents  are  the  next  instruction) 

7  Conditional sequence changing, or transfer of control

8  Input 

9  Output 

An  instruction  can,  of  course,  provide  several  of  these  facilities. 
For  instance,  some  computers  have  an  instruction  that  subtracts 
the  contents  of  a  memory  location  from  an  accumulator  and  leaves 
the  result  in  that  memory  location  and  in  the  accumulator;  this 
instruction  fulfills  all  of  requirements  1-4  above.  Requirement  5 
is  met  in  a  somewhat  primitive  manner  if  instructions  can  be 
executed  from  erasable  memory,  and  is  met  elegantly  by  the  use 
of index registers. Still another scheme, somewhat similar to one
used in the Bendix G-20, is employed in the AGC. Requirement
6 is usually fulfilled by having an instruction location counter
which contains the address of the next instruction to be executed,
and is incremented by one when an instruction is fetched.
Alternatively, each instruction may include the address of the next
instruction,  as  is  often  done  in  machines  having  drum  memories. 
In  the  AGC,  as  in  most  short-word  computers,  the  former  method, 
with  one  single-address  instruction  per  word,  is  clearly  the  simplest 
and  cheapest.  Requirement  7  is  generally  met  by  examining  a 
condition  such  as  the  sign  of  an  accumulator  and,  if  the  condition 
is  satisfied,  either  incrementing  the  instruction  location  counter 
(skipping),  or  using  an  address  included  in  the  instruction  as  that 
of  the  next  instruction  (conditional  transfer  of  control).  An  uncon- 
ditional transfer  of  control  is  usual  but  not  necessary,  since  any 
desired condition can be forced. Most machines have special
input-output instructions to satisfy requirements 8 and 9. In the
AGC, however, since input and output is through addressable
registers, input is subsumed under fetching from memory, and
output under storing in memory. Counter incrementing and program
interruption aid these functions also.

Further  criteria 

The major goals in the AGC were efficient use of memory, reasonable
speed of computing, potential for elegant programming, efficient
multiple precision arithmetic, efficient processing of input
and output, and reasonable simplicity of the sequence generator.
The constraints affecting the order code as a whole were the word
length, one's complement notation, parallel data transfer, and the
characteristics of the editing registers. The ground rules governing
the choice of instructions arose from these goals and constraints.

a    Three  bits  of  an  instruction  word  are  devoted  to  operation 
code. 

b    Address  modification  must  be  convenient  and  efficient. 

c    There should be a multiply instruction yielding a double
length product.

d    Treatment  of  overflow  on  addition  must  be  flexible. 

e    A  Boolean  combinatorial  operation  should  be  available. 

f   No  instruction  need  be  devoted  to  input,  output,  or  shifting. 

This list is by no means complete, but gives a good indication of
what kind of computer the AGC has to be. In the following paragraphs
the ways in which the instructions fulfill the above requirements
are described.

Details  of  the  instruction  set 

In  the  listing  that  follows,  L  denotes  the  location  of  the  instruction; 
K  denotes  the  data  address  contained  in  the  instruction.  Paren- 
theses mean  "content  of,"  and  the  leftward  arrow  means  that  the 
register  named  at  the  arrowhead  is  set  to  the  quantity  named  to 
the  right. 

L: TC K; Transfer Control

Q ← L + 1; go to K.

This is the primary method of transferring control to any stated
location, and thus meets part of requirement 7. The setting of the
return address register Q renders complex subroutines feasible. TC
Q may be used to return from a subroutine (with no other TC's)
because the binary number "L + 1" is the same as the binary word
"TC L + 1," by virtue of the TC code being all zeros. TC A
behaves like an "execute" instruction, executing whatever instruction
is in A, because Q follows A in the address pattern; see
Table 1.

L: CCS K; Count, Compare, and Skip

If (K) > +0, A ← (K) − 1, no skip; if (K) = +0, A ← +0, skip
to L + 2; if (K) < −0, A ← −1 − (K), skip to L + 3; if (K) =
−0, A ← +0, skip to L + 4.

This instruction fulfills the remainder of requirement 7 and
provides several features. It is clear that in a machine with a 3-bit
operation code there should be only one code devoted entirely to
branching, if at all possible. It is inefficient to program a zero test
using only a sign-testing code; it is even more inefficient to program
a sign test using only a zero-testing code. This instruction
was therefore designed to test both types of conditions simultaneously.
It has to be a four-way branch, and since there is only one
address per instruction, it follows that CCS must be a skipping-type
branch.

The function of (K) delivered to A is the diminished absolute
value (DABS). It serves two primary purposes: to do most of the
work in generating an absolute value, and to apply a negative
increment to the contents of a loop-counting register, so that CCS
has some of the properties of TIX in the IBM 704.
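
The four-way branch and the diminished absolute value can be summarized in a small Python sketch. The function name and the return convention (new contents of A, plus the offset from L of the next instruction) are assumptions made for illustration; the operand is a 15-bit one's complement word.

```python
MASK = 0o77777        # 15-bit word, all ones is minus zero
SIGN = 0o40000        # sign bit of a 15-bit word


def ccs(k_word):
    """Return (new A, offset of next instruction from L) for CCS K."""
    if k_word == 0:                    # (K) = +0
        return 0, 2
    if k_word == MASK:                 # (K) = -0
        return 0, 4
    if k_word & SIGN:                  # (K) negative and nonzero
        return (~k_word & MASK) - 1, 3
    return k_word - 1, 1               # (K) positive and nonzero, no skip


def count_down(n):
    """Loop-counting use: apply CCS repeatedly until the +0 branch is taken."""
    passes = 0
    a, offset = ccs(n)
    while offset == 1:
        passes += 1
        a, offset = ccs(a)
    return passes
```

In the loop-counting use, a register initially holding n causes the "no skip" path to be taken n times before the +0 branch is reached, which is the sense in which CCS resembles an index-decrementing transfer.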

L: INDEX K; Index using K

Use (L + 1) + (K) as the next instruction.

In a short-word machine where there is no room in the instruction
word to specify indexing or indirect addressing, this code
meets requirement 5 in a way far superior to forming an instruction
and placing it in A or in erasable memory for execution. INDEX
operates on whole words, so that the operation code as well as
the address may be modified. It may be used recursively (consider
the implications of several INDEX's in succession, assuming that
no operation codes are modified). Finally, it permits more than
8 operation codes to be specified in 3 bits, since overflow of the
indexing addition is detectable.

L: XCH K; Exchange

(A) ↔ (K).

This  instruction  meets  requirements  1,  2,  and  8.  When  K  is 
in  fixed  memory,  it  is  simply  a  data-fetching  (clear  and  add)  code. 
Its  use  with  erasable  memory  aids  efficiency  by  reducing  the  need 
for  temporary  storage.  XCH  is  also  an  important  input  instruction 
in  a  machine  where  addressable  counters,  incremented  in  response 
to  external  events,  are  an  input  medium,  because  a  counter  can 
be  read  out  and  reset  (to  zero  or  any  desired  value)  by  XCH  with 
no  chance  of  missing  a  count. 

L: CS K; Clear and Subtract

A ← −(K).

CS is the primary means of sign-changing and logical negation,
and so fulfills requirements 1 and 3. Since there is no clear and
add instruction, it is the usual operation for nondestructive readout
of erasable memory in simple data transfers, that is, when no
addition or other arithmetic is required. Usually the programming
can be arranged so that complementing during transfer is acceptable;
otherwise the CS can be followed by CS A before storing.

L: TS K; Transfer to Storage

K ← (A); if (A) includes ± overflow, A ← ±1, skip to L + 2.



This instruction is the primary means of transfers to memory
and output, satisfying requirements 2 and 9. It is also the most
convenient method of testing for overflow. Since A and the other
central registers have two sign positions, overflow indication is
retained in a central register. TS always stores (A) and tests
whether overflow is present. If K is in erasable memory and is
not a central register, the lower-order sign bit S1 is not transmitted;
this is the process of overflow correction. If positive overflow
indication is present in A, TS skips over the next instruction and
sets A ← +1 (+1 denotes octal 000001); if negative overflow is
present, TS skips over the next instruction and sets A ← −1 (−1
denotes octal 177776); otherwise (A) are unchanged. The sequence

TS  K 

XCH    ZERO    (ZERO in fixed memory)

suffices  to  store  in  K  an  overflow-corrected  word  of  a  multiple- 
precision  sum  and  leave  in  A  the  interflow  to  the  next  higher-order 
part.  TS  A  skips  if  either  type  of  overflow  is  present,  but  leaves 
all  16  bits  of  (A)  unchanged. 
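
The behavior of TS with an ordinary erasable location, and of the two-instruction sequence above, can be modeled as follows. This is a sketch only: the 16-bit accumulator layout (S2 highest, S1 next) follows the earlier discussion, and the function names and return conventions are hypothetical.

```python
def ts(acc):
    """Model TS K for K in ordinary erasable memory.

    Returns (15-bit word stored in K, new contents of A, skip flag).
    """
    s2, s1 = (acc >> 15) & 1, (acc >> 14) & 1
    stored = (s2 << 14) | (acc & 0o37777)   # S1 not transmitted: overflow correction
    if s2 == s1:
        return stored, acc, False           # no overflow: A unchanged, no skip
    new_a = 0o000001 if s2 == 0 else 0o177776
    return stored, new_a, True              # overflow: A set to +1 or -1, skip


def ts_then_xch_zero(acc):
    """Model the sequence TS K; XCH ZERO, returning (stored word, interflow in A)."""
    stored, a, skip = ts(acc)
    if not skip:
        a = 0        # XCH ZERO is executed: A receives +0 from fixed memory
    return stored, a
```

When overflow is present the XCH is skipped, so A is left holding ±1; otherwise the XCH executes and A is left holding zero. Either way A contains exactly the interflow to be added into the next higher-order part.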

Finally,  a  computed  transfer  of  control  may  be  achieved  by 
TS  Z  because  Z  is  the  program  counter;  only  the  low-order  12 
bits  of  (A)  are  significant,  being  the  address  of  the  instruction  to 
which control is transferred. Overflow in (A) in this case does not
affect the transfer but sets A ← ±1.

L: AD K; Add

A ← (A) + (K); if the final (A) includes ± overflow,
OVCTR ← (OVCTR) ± 1.

Addition  is  the  most  frequently  used  combinatorial  operation 
(requirement  4).  The  property  of  OVCTR  is  used  chiefly  in  devel- 
oping double-precision  products  and  quotients,  partly  because  the 
additions  in  these  processes  are  less  susceptible  to  false  overflow 
than  are  multiple-precision  additions. 

L: MASK K; Mask

A ← (A) ∩ (K).

This  is  the  only  combinatorial  Boolean  instruction,  and  may 
be  used  with  CS  to  generate  any  Boolean  function. 
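
That MASK (logical AND) together with CS (complement) suffices for any Boolean function follows from De Morgan's laws. The short Python sketch below makes the point for OR and exclusive OR on 15-bit words; the helper names are hypothetical and each helper stands for the data effect of one instruction, not for an actual AGC program.

```python
MASK15 = 0o77777


def cs(k):
    """Data effect of CS K: the Boolean complement of a 15-bit word."""
    return ~k & MASK15


def mask(a, k):
    """Data effect of MASK K: bitwise AND of A with (K)."""
    return a & k


def logical_or(x, y):
    """x OR y = NOT(NOT x AND NOT y), using only cs and mask."""
    return cs(mask(cs(x), cs(y)))


def logical_xor(x, y):
    """x XOR y = (x AND NOT y) OR (NOT x AND y)."""
    return logical_or(mask(x, cs(y)), mask(cs(x), y))
```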

Extracodes 

The AGC instruction set was carried over in large part from its
ancestor, MOD 3C [Alonso et al., 1961]. All instructions of MOD
3C were retained in the AGC, modifications and additions being
adopted where a substantial increase in computing power could
be obtained at small cost. The MOD 3C instruction set was like
the one described above for the AGC with two major exceptions:
first, instead of a mask instruction, MOD 3C had a multiply
instruction. Second, the transfer to storage instruction did not
include the property of skipping on overflow, although it did have
properties which aided masking.

After  the  design  of  MOD  3C  was  completed,  it  was  discovered 
that the INDEX instruction could be used to expand the instruction
set beyond eight instructions by producing overflow in the
instruction  word  following  the  INDEX.  For  example,  the  addition 
of  octal  47777  to  the  instruction  word  "CS  K"  in  the  course  of 
an  INDEX  instruction  will  cause  negative  overflow,  producing  MP 
K,  a  multiply  instruction  with  operand  address  K. 
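
The arithmetic of this example can be checked with a brief Python sketch. It assumes the 16-column adder model with duplicated sign bits described in Sec. 4, and it assumes that CS has operation code 4 (consistent with TC being all zeros and with the order of the listing above, but an assumption nonetheless); the names are hypothetical.

```python
ADDER_MASK = 0o177777


def load(word15):
    """Memory -> adder: duplicate the sign bit into S2 and S1."""
    sign = (word15 >> 14) & 1
    return (sign << 15) | word15


def add16(a, b):
    """16-bit one's complement addition with end around carry."""
    total = a + b
    if total > ADDER_MASK:
        total = (total & ADDER_MASK) + 1
    return total


CS_OPCODE = 0o4                      # assumed 3-bit operation code for CS
K = 0o1234                           # arbitrary 12-bit operand address

cs_k = (CS_OPCODE << 12) | K         # the instruction word "CS K"
total = add16(load(0o47777), load(cs_k))
s2, s1 = (total >> 15) & 1, (total >> 14) & 1

print(oct(total))                    # 0o111234
print((s2, s1))                      # (1, 0): negative overflow
print(bin(total >> 12))              # 0b1001: the four high-order adder bits
```

The address field K passes through unchanged, while the four high-order bits differ from those of any of the eight basic instructions once both sign positions are examined; it is this pattern that is decoded as the extracode.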

In  order  to  implement  the  extracodes  in  the  AGC,  it  was 
necessary  to  provide  a  path  from  the  high-order  4  bits  of  the  adder 
to the unaddressable sequence selection register SQ. Part of this
path  is  the  unaddressable  buffer  register  B;  these  requirements 
helped  to  suggest  the  benefits  of  retaining  two  sign  bit  positions 
in  all  the  central  registers. 

In  principle,  eight  additional  instruction  codes  can  be  obtained 
by  causing  overflow,  but  we  did  not  feel  obliged  to  use  them  all. 
Because every extracode must be indexed, the instructions chosen
for this class have, to some degree, one of two properties: they are
normally indexed, or they take long enough so that the cost of
indexing without address modification is small. All the extracodes
are combinatorial, and therefore relate to requirement 4.

L: MP K; Multiply

A ← upper part, LP ← lower part, of (A) · (K); the two words
of the product agree in sign, which is determined strictly by the
sign bits of the operands.

Experience  with  MOD  3C  showed  that  it  was  worthwhile 
making  a  completely  algebraic,  self-contained  multiply  instruction, 
especially  in  doing  double-precision  multiplication  whose  oper- 
ands have  independent  signs.  The  AGC  multiply  is  much  faster 
than  that  of  MOD  3C,  being  limited  by  adder  carry  propagation 
time  rather  than  core-switching  time. 

L: DV K; Divide

A ← quotient, Q ← |remainder|, of (A)/(K); LP ← nonzero
number with the sign of the quotient.

Many  facets  of  AGC  design  originally  adopted  for  other  reasons 
combined  to  make  a  divide  instruction  inexpensive.  The  foremost 
of  these  is  the  nature  of  the  editing  registers,  which  are  in  the 
standard  erasable  memory  and  have  no  special  wiring.  The  special 
properties  of  these  registers  are  supplied  by  a  shift  or  cycle  of  the 
word  being  written  into  the  memory  local  register  G,  when  the 
address  of  an  editing  register  is  selected.  The  central  loop  of  DV 
selects  such  an  address  and  inhibits  memory  operations,  so  that 
all  the  left  shifts  required  in  division  are  accomplished  in  the  G 
register while the editing register itself remains unchanged. The
microprogrammed nature of order construction makes a restoring
algorithm more efficient than a nonrestoring one. The quotient
delivered  to  A  has  a  sign  determined  according  to  normal  algebraic 
rules  by  the  signs  of  (A)  and  (K);  the  same  sign  is  available  in  LP 
to  aid  in  determining  the  correct  sign  of  the  remainder  from  those 
of  the  divisor  and  quotient  in  case  the  quotient  has  been  absorbed 
by subsequent processing. DV is not usually indexed, but it pays
such  large  benefits  in  space  and  time,  especially  in  double-pre- 
cision division,  that  the  cost  of  extracode  indexing  is  negligible. 
If  the  divisor  is  less  in  magnitude  than  the  dividend,  or  is  zero, 
the  quotient  has  correct  sign  and,  in  general,  maximum  magnitude. 
No  infinite  loop  results  in  any  case. 

L: SU K; Subtract

A ← (A) − (K); if the final (A) includes ± overflow,
OVCTR ← (OVCTR) ± 1.

The  primary  justification  for  this  instruction  is  that  it  allows 
multiple-precision  addition  subroutines  to  be  changed  into  multi- 
ple-precision subtract  subroutines  merely  by  changing  the  indexing 
quantity. There are occasions in the middle of involved calculations
where it is clumsy to construct a subtraction out of complementations
and additions, especially when the sign of an overflow
is of interest. Since SU differs from AD only in that the operand
from  memory  is  read  out  of  the  complement  side  of  the  buffer 
register  B  rather  than  the  direct  side,  its  cost  is  virtually  zero. 
This  last  is  not  necessarily  true  when  using  core-transistor  logic, 
or  two's  complement  notation. 

7.    Expansion  of  memory  addressing 

The AGC's 12-bit address field is insufficient for specifying directly
all the registers in its memory. This predicament seems increasingly
to afflict most computers, either because indirect addressing
is assumed as a necessary evil from the start or, as was our case,
because our earliest estimates of memory requirements were wrong
by a factor of two or three. The method of indirect addressing
we  arrived  at  uses  a  bank  register  MB,  but  with  an  important 
modification:  the  5-bit  number  stored  in  MB  has  no  effect  unless 
the  address  is  in  the  range  (octal)  6000  to  7777.  The  MB  register 
contents  are  not  interpreted  as  higher-order  bits  of  the  address; 
they  are  interpreted  as  integers  which  specify  which  bank  of  1024 
words  is  meant  in  the  event  of  the  address  part  of  the  instruction 
being  in  the  ambiguous  range.  The  over-all  map  of  memory  is 
shown in Table 2. The domain of unambiguous fixed memory
addresses has come to be known as "fixed-fixed."

Table 2    Address part of an instruction word

Address (decimal)   Interpretation

0-3071              Fixed and erasable memory; unambiguous addresses.
3072-4095           Fixed memory, ambiguous address. Contents of MB
                    used to resolve the ambiguity. Up to 32 such banks
                    are possible.

It is interesting that this method of extending the addressing
capability was not the result of trying to improve upon more
conventional methods, but was almost a consequence of the physical
difference between fixed and erasable memory. Since all data
other than constants are concentrated in the erasable memory,
these had to be exempt from modification by the MB register. An
alternative arrangement, whereby only the addresses of instructions
(as opposed to the addresses within an instruction word) are
modified, would be deficient in that it would allow only instructions
to be stored in banks; there would be no way to refer to
constants stored in banks, or to use bank addresses to store arguments
of arithmetic operations. The possibility of using two bank
registers is worthy of serious consideration [Casale, 1962], but it
did not occur to us.

In addition to the addresses in erasable, it is necessary to
exempt  the  addresses  of  interrupting  programs  (i.e.,  the  addresses 
to  which  a  program  interrupt  transfers  control)  from  the  influence 
of  the  MB  register.  It  was  clear  that  it  would  be  valuable  to  have 
a  large  body  of  unambiguous  addresses  for  use  in  executive  and 
dispatcher  programs. 

The  most  frequent  and  critical  applications  of  bank  changing 
are  in  the  AGC's  interpretive  mode.  Most  of  the  programs  relevant 
to  navigation  are  written  in  a  parenthesis-free  pseudocode  notation 
for  economy  of  storage.  An  interpretive  program  executes  these 
pseudocode  programs  by  performing  the  indicated  data  accesses 
and  subroutine  linkages. 

The  format  of  the  notation  permits  two  macrooperators  (e.g., 
"double-precision  vector  dot  product")  or  one  data  address  to  be 
stored in one AGC word. Thus data addresses appear as full 15-bit
words, potentially capable of addressing up to 32,768 registers.
Each  such  address  is  examined  in  the  interpreter  and  the  contents 
of  the  bank  register  are  changed  if  necessary;  preparation  is  also 
made  for  subsequent  return  if  a  subroutine  call  is  being  made. 

The structure of the interpretive program, and its relationship
to the computer characteristics discussed in this paper, will not
be taken up here except to point out that parenthesis-free notation
is particularly valuable in a short-word computer such as the AGC.
It  permits  a  very  substantial  expansion  of  the  address  and  pseudo- 
operation  fields  without  sacrificing  efficiency  in  program  storage 
[Muntz,  1962]. 


The  conversion  of  a  15-bit  address  into  a  bank  number  and  an 
ambiguous  12-bit  address  is  as  follows:  the  top  5  bits  correspond 
directly  to  the  desired  bank  number.  The  remaining  lower-order 
10  bits,  logically  added  to  octal  6000,  form  the  proper  ambiguous 
address.  If  the  15-bit  address  is  less  than  octal  6000,  however,  the 
address  is  in  erasable  or  fixed-fixed  memory.  In  this  case  the  logical 
addition  of  octal  6000  is  suppressed. 
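
The conversion just described can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not the interpreter's actual code; the function name and the use of None to mean "bank register not involved" are assumptions made for the example.

```python
BANK_BASE = 0o6000   # start of the ambiguous 12-bit range (decimal 3072)
BANK_SIZE = 0o2000   # 1024 words per bank


def split_address(addr15):
    """Convert a 15-bit address into (bank number, 12-bit address).

    The bank is None for addresses below octal 6000, which lie in
    erasable or fixed-fixed memory and ignore the bank register.
    """
    if not 0 <= addr15 < 1 << 15:
        raise ValueError("address must fit in 15 bits")
    if addr15 < BANK_BASE:
        return None, addr15               # addition of octal 6000 suppressed
    bank = addr15 >> 10                   # top 5 bits give the bank number
    offset = addr15 & (BANK_SIZE - 1)     # remaining lower-order 10 bits
    return bank, BANK_BASE | offset       # logical addition of octal 6000
```

For example, split_address(0o10123) yields bank 4 and the ambiguous address octal 6123. Because every address at or above octal 6000 has a top-5-bit value of at least 3, the lowest bank number this rule can produce is 3.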

It  is  possible  to  have  a  program  in  one  bank  call  a  closed 
subroutine  in  another  bank,  and  then  have  control  returned  to  the 
proper  place  in  the  bank  of  origin.  This  is  done  by  means  of  a 
short  bank  switching  routine  which  is  in  fixed-fixed  memory. 

One  potential  awkwardness  about  this  method  of  extending 


memory  addresses  is  the  possible  requirement  for  a  routine  in  one 
bank  to  have  access  to  large  amounts  of  data  stored  in  another. 
There  are  many  programming  solutions  to  this  problem,  obviously 
at  a  cost  in  operating  speed;  a  better  solution  would  be  to  have 
two  bank  registers.  No  problems  of  this  nature  have  yet  material- 
ized, however. 

References 

AlonR63; AlonR60; AlonR61; AlonR62; BeckF61; CasaC62; EnglW62;
HopkA63; MuntC62; RichR55; WaleW62; Proc. Conf. Spaceborne Computer
Eng., Anaheim, Calif., Oct. 30-31, 1962.


APPENDIX 1    BACKGROUND FOR AGC DESIGN


Name, date   Memory size            Number         Number of            Purpose                 Features incorporated
completed    (F = fixed,            of bits        instructions         of design               at this stage
             E = erasable)

MOD 1,       F: 448                 11 and parity  4 plus involuntary   Feasibility prototype   Counter increments, interrupts,
1960         E: 64                                                                              core-transistor logic, pulse rate
                                                                                                outputs, editing registers, wired-in
                                                                                                fixed memory, interpretive programs.

MOD 2,       about 4000 total       23 and parity  16 plus indirect     Unmanned space probe    "Extended Operation" subroutine
not built                                                                                       linkages (only instance).

MOD 3S,      F: 3584                15 and parity  8                    Earth satellite         Modified one's complement,
1962         E: 512                                                                             parallel adder, addressable
                                                                                                central registers.

MOD 3C,      F: greater than 10^?   15 and parity  8 and involuntary    Apollo guidance         CCS, INDEX, MULTIPLY instructions,
1962         E: greater than 10^?                                                               overflow counter, bank switching.

AGC,         F: greater than 10^?   15 and parity  11 and involuntary   Apollo guidance         DV, SU, MSK instructions,
1963         E: greater than 10^?                                                               editing memory buffer, all-transistor
                                                                                                NOR logic instead of core-transistor
                                                                                                logic, extracodes, parenthesis-free
                                                                                                interpreter.


Chapter  8 

The UNIVAC system¹


J. Presper Eckert, Jr. / James R. Weiner
H. Frazer Welsh / Herbert F. Mitchell


Organization  of  the  UNIVAC  system 

In March 1951, the first UNIVAC² system formally passed its
acceptance tests and was put promptly into operation by the
Bureau of the Census. Since the UNIVAC is the first computer
which can handle both alphabetic and numerical data to reach
full-scale operation so far, its operating record and a review of
the types of problems to which it has been applied provide an
interesting milestone in the ever-widening field of electronic digital
computers.

The  organization  of  the  UNIVAC  is  such  that  those  functions 
which  do  not  directly  require  the  main  computer  are  performed 
by separate auxiliary units each having its own power supply. Thus
the keyboard to magnetic tape, punched card to magnetic tape
and tape to typewritten copy operations are delegated to auxiliary
components.

The main computer assembly includes all of those units which
are directly concerned with the main or central computer operations.
A block diagram of this arrangement is shown in Fig. 1. All
of the elements shown are contained within the central computer
casework except the supervisory control desk (SC) and the Uniservos,²
to which the lines in the upper right section of the diagram
connect.

The supervisory control, in addition to all the necessary control
switches and indicator lights, contains an input keyboard. Also
cabled to the supervisory control is a typewriter which is operable
by the main computer. By means of these two units, limited
amounts of information can be inserted or removed either at the
will of the operator or by the programmed instructions.

The  input-output  circuits  operate  on  all  data  entering  or  leav- 
ing the  computer.  The  input  and  output  synchronizers  properly 
time  the  incoming  or  outgoing  data  for  either  the  Uniservos  (tape 
devices)  or  the  supervisory  control  devices.  The  input  and  output 
registers (I and O) are each 60-word (720-character) temporary
storage registers which are intermediate between the main computer
and the input-output devices.

The high-speed bus amplifier is a switching central through
which all data must pass during transfer between any arithmetic
register and the main memory or between the memory and the
input-output registers. The arithmetic registers are shown along
the bottom of the diagram, each connected to the high speed bus
system.

¹AIEE-IRE Conf., 6-16, December, 1951.
²Registered trade mark.

The L-, F-, X-, and A-registers are each of one word or 12-character
capacity and are directly concerned with the arithmetic
operations. The V- and Y-registers are of 2- and 10-word capacity,
respectively. They are used solely for multiple word transfers
within the main memory. Associated with the arithmetic registers
are the algebraic adder (AA), the comparator (CP), and the
multiplier-quotient counter (MQC).

Addition-subtraction  instructions 

The  addition-subtraction  operations  are  performed  in  conjunction 
with  the  comparator  since  all  numerical  quantities  are  absolute 
magnitudes  with  an  algebraic  sign  attached.  Before  either  an 
addition  or  subtraction  is  performed,  the  two  quantities,  one 
already  in  the  A-register  and  the  other  either  from  the  memory 
or from the X-register, depending upon the particular instruction,
are compared for magnitude and sign. The adder inputs can then
be switched so as always to produce a noncomplemented result
for any operation. The choice of adder input arrangement is therefore
under the control of the comparator. The comparator also
determines  the  proper  sign  for  the  result  according  to  the  usual 
algebraic  rules. 

One additional function performed by the comparator for addition
and subtraction is to control the complementer. This determination
is based upon which operation (+ or −) is indicated,
and whether the signs are like or unlike. For a subtract instruction,
the  sign  of  the  subtrahend  is  reversed  before  entering  the  com- 
parator. The  comparator  then  compares  the  signs  of  the  quantities 
in  order  to  determine  whether  the  two  quantities  are  subtracted 
or  added. 
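
The comparator's role in addition and subtraction can be illustrated with a short Python sketch. It models each quantity as a (sign, magnitude) pair and follows the rules stated above; the function name and the data representation are assumptions for the example, and register widths and overflow are ignored.

```python
def signed_add(a, b, subtract=False):
    """Add or subtract two (sign, magnitude) pairs; sign is +1 or -1."""
    sign_a, mag_a = a
    sign_b, mag_b = b
    if subtract:
        sign_b = -sign_b              # subtrahend sign reversed first
    if sign_a == sign_b:              # like signs: magnitudes are added
        return sign_a, mag_a + mag_b
    # Unlike signs: the smaller magnitude is subtracted from the larger,
    # so the result never appears in complemented form, and the sign
    # follows the operand of larger magnitude.
    if mag_a >= mag_b:
        return sign_a, mag_a - mag_b
    return sign_b, mag_b - mag_a
```

For instance, signed_add((1, 5), (1, 8), subtract=True) gives (-1, 3): the comparison of magnitudes decides which operand is routed through the complementer, so the magnitude 3 is obtained directly.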

Fig. 1. Block diagram of UNIVAC.

Multiplication instruction

The multiplication process requires the services of the adder, the
comparator, the multiplier-quotient counter and the four arithmetic
registers. During the first step of multiplication the X-register
receives the multiplier from the memory and the comparator
determines the sign of the final product by comparing the signs
of the multiplier and multiplicand. During the next three steps
the multiplicand, which has been stored in the L-register by some
previous instruction, is transferred three times to the A-register
through the algebraic adder. The result, three times the multiplicand,
is then stored in the F-register. During the next 11 steps
of multiplication, the successive multiplier digits, beginning with
the least significant, are transferred from the X-register to the
multiplier-quotient counter. The multiplier-quotient counter then
determines whether each particular multiplier digit is less than
three, or greater than or equal to three.

If  the  former,  the  L-register  releases  the  multiplicand  to  the 
A-register  via  the  adder,  and  the  multiplier-quotient  counter  is 
stepped  downward  one  unit.  If  the  multiplier  digit  is  equal  to  or 
greater than three, the multiplier-quotient counter sends a signal
to the F-register which releases three times the multiplicand to
the A-register and the multiplier-quotient counter is stepped three
times. Thus a multiplier digit of seven would be processed as two
transfers from the F-register to the A-register and one transfer from
the L-register to the A-register, or a total of three transfers.

When the multiplier-quotient counter reaches zero, the next
multiplier digit is brought in from the X-register, while the A-register,
containing the first partial product, is shifted one position
to the right.

During the final step of multiplication, the sign is attached to
the product which has been built up in the A-register. One of the
several available multiplication instructions causes the least significant
digits, as they are shifted beyond the limits of the A-register,
to be transferred to the X-register where they replace the
multiplier digits as they are moved to the multiplier-quotient
counter. Thus 22-place products can be obtained as well as
11-place.
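
The digit-by-digit scheme can be sketched in Python as follows. This is an illustration under simplifying assumptions, not a model of the circuits: magnitudes are plain non-negative integers, the function name is invented for the example, and each addend is scaled by a power of ten where the machine instead shifts the A-register to the right.

```python
def multiply_by_transfers(multiplicand, multiplier, digits=11):
    """Multiply two non-negative integers one multiplier digit at a time.

    Returns (product, transfers), where transfers counts the additions
    made into the accumulator.
    """
    triple = 3 * multiplicand            # formed first, held in the F-register
    acc = 0                              # stands in for the A-register
    transfers = 0
    for position in range(digits):       # least significant digit first
        counter = (multiplier // 10 ** position) % 10
        while counter > 0:
            if counter >= 3:             # release three times the multiplicand
                acc += triple * 10 ** position
                counter -= 3
            else:                        # release the multiplicand itself
                acc += multiplicand * 10 ** position
                counter -= 1
            transfers += 1
    return acc, transfers
```

With a multiplier digit of seven, as in the text, multiply_by_transfers(123, 7) makes three transfers (3 + 3 + 1). Under this scheme no digit needs more than four transfers, the worst case being eight (3 + 3 + 1 + 1).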

Division  instruction 

The division operation is performed by a nonrestoring method. The
divisor is stored in the L-register by some previous instruction and
the dividend is brought from the memory and put in the A-register
during the first step of the division instruction. As in multiplication,
the signs of the two operands are compared in the comparator
at this time and the sign of the quotient is then stored in the
comparator pending completion of the division operation. The
principal stages of division consist of transferring the divisor from
the L-register to the A-register through the complementer and
adder as many times as required to produce a quantity less than
zero in the A-register, the dividend having been first shifted one
position to the left. The multiplier-quotient counter counts each
transfer, thereby building up the first quotient digit. As soon as
the quantity in the A-register (neglecting its original sign) goes
negative, the digit in the multiplier-quotient counter, not counting
the transfer which causes the remainder to go negative, is transferred
to the X-register and the remainder in the A-register is
shifted one place to the left. The divisor is then added to the
A-register until the quantity becomes positive. This time the
multiplier-quotient counter must give the complement of the
number of transfers for the real quotient digit. Special complementing
read-out gates provide this method of interpreting the
multiplier-quotient counter.

The X-register therefore collects the quotient, digit by digit,
from the multiplier-quotient counter until the full 11 digits have
been obtained. The quotient is then transferred to the A-register
and the sign from the comparator (CP) is affixed during the final
stage of the divide instruction.
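
A minimal Python sketch of the nonrestoring method follows, working on magnitudes only. Several points are assumptions made for the illustration: operands are integers with the dividend smaller than the divisor (so the quotient is a fraction below one), a remainder of exactly zero is treated as "positive," and the complemented read-out is taken to be ten minus the number of transfers. The function name is likewise invented.

```python
def nonrestoring_quotient_digits(dividend, divisor, digits=11):
    """Return the leading fractional decimal digits of dividend/divisor.

    Assumes 0 <= dividend < divisor, i.e. a quotient below one.
    """
    if not 0 <= dividend < divisor:
        raise ValueError("requires 0 <= dividend < divisor")
    remainder = dividend                 # stands in for the A-register
    quotient = []                        # digits collected in the X-register
    for _ in range(digits):
        remainder *= 10                  # shift one place to the left
        transfers = 0
        if remainder >= 0:
            while remainder >= 0:        # subtract divisor until negative
                remainder -= divisor
                transfers += 1
            quotient.append(transfers - 1)    # last transfer is not counted
        else:
            while remainder < 0:         # add divisor until non-negative
                remainder += divisor
                transfers += 1
            quotient.append(10 - transfers)   # complemented read-out
    return quotient
```

For example, nonrestoring_quotient_digits(1, 8, 4) returns [1, 2, 5, 0]. The subtracting and adding phases alternate from digit to digit, and the remainder is never restored after it changes sign.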

The  other  internal  operations  of  the  UNIVAC  include  many 
transfer  instructions  by  which  words  may  be  moved  among  the 
registers  and  memory  with  and  without  clearing,  the  extraction 
instruction by which certain digits of a word may be extracted
into another word according to the parity of the corresponding
digits of an extractor word; shift instructions; and special control
instructions such as breakpoint, transfer of control, (explained in
subsequent  paragraphs)  and  stop. 

Basic  operating  cycle 

The basic operating cycle of the UNIVAC is founded upon single
address instructions which specify the memory location of one
word. In the case of the arithmetic instructions which require two
operands, one of the operands must be moved into the proper
register by some previous instruction. In order to control the
sequence of instructions, a special counter, called the control
counter (CC), retains the memory location from which the succeeding
instruction word is to be obtained. Each time a new instruction
word is received from the memory, the quantity in the control
counter is passed through the adder where a unit is added to it.
Therefore the normal sequence is to refer to successive memory
locations for successive instruction words. Initially the control
counter is cleared to zero and the first group of instructions must,
therefore, be placed in memory locations from zero upward. A
transfer of control instruction enables the programmer to change
the control counter reading whenever desired and thus shift from
one sequence to another. After a transfer of control takes place,
the new number in the control counter is increased by unity each
time a new instruction word is obtained from the memory.


Transfer  of  control  instructions 

The transfer of control instructions are of three types: the unconditional
transfer which changes the control counter reading without
question, and two conditional instructions which require that
either  equality  or  a  specific  inequality  exists  between  the  words 
in  the  A-register  and  the  L-register.  In  the  former  case  the  quan- 
tities must  be  identical  for  transfer  of  control  to  occur  and  in  the 
latter  the  quantity  in  the  A-register  must  be  greater  than  the 
quantity  in  the  L-register  for  the  control  counter  reading  to  be 
changed. 

Since  the  UNIVAC  can  handle  alphabetic  as  well  as  numerical 
data,  these  conditional  transfer  instructions  are  as  useful  for  alpha- 
betizing as  they  are  to  determine  if  a  certain  iterative  arithmetic 
process  has  been  performed  often  enough  to  come  within  specified 
numerical  tolerances. 

Control  register 

Since six characters (intermixed alphabetic and numerical) are
sufficient to specify an instruction and there are 12 characters per
word, each instruction word can represent two independent instructions.
A 1-word register, called the control register (CR), has
been provided which stores each instruction word as it comes from
the memory. Thus one memory referral is sufficient for a pair of
instructions and the control register stores both halves so that the
second instruction is available as soon as the first has been completed.

The  general  term  control  circuits  includes  all  those  elements 
which  work  together  to  process  the  instruction  routine.  As  each 
instruction  word  reaches  the  control  register,  the  first  half  of  it 
is passed immediately into the static register (SR). The static
register drives the main function table and memory switch. The
instruction digits are translated by the function table into the
appropriate  control  signals  for  the  instruction  called  for.  The 
memory  switch  selects  the  location  called  for  by  the  memory 
location  digits  and  opens  the  proper  memory  channel  to  the  high- 
speed bus  system  at  the  proper  time.  Since  the  memory  is  con- 
structed of  100  channels,  each  holding  ten  words,  the  memory 
switch  is  a  combination  of  spatial  and  temporal  selection. 

Cycle  counter 

Implicit within each instruction, as translated by the function
table, is an ending signal which causes the computer to move on
to the next instruction. The key to this sequence is the cycle
counter (CY), which is advanced by the ending pulse. The cycle
counter is a 2-stage 4-position counter, which is connected into
the function table. By virtue of this relation, CY develops signals
in addition to those developed by the instruction, which, for example,
can cause the control register to transfer the second half
of the instruction word into the static register when the first half
has been completed. Similarly, after the second half instruction
is finished the cycle counter causes the reading of the control
counter to pass into the memory location section of the static
register and thus cause the next instruction word to be transferred
from the memory to the control register. When the word reaches
the control register, the cycle counter also causes the control
counter reading to be increased by unity. The four cycles are
designated by the first four Greek letters: α (transfer CC to SR),
β (transfer memory to CR), γ (perform first instruction), and δ
(perform second instruction).
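
The four-cycle sequence can be traced with a small Python sketch. Everything beyond the α, β, γ, δ ordering is an assumption made for the illustration: the instruction names "NOP" and "JUMP" are hypothetical, memory is a dictionary of instruction pairs, and a transfer of control in the first half is assumed to leave the second half of the same word to be performed.

```python
def run_cycles(memory, words):
    """Trace the four-cycle sequence for `words` instruction words.

    `memory` maps a location to a pair of instructions; each instruction
    is a tuple such as ("NOP",) or ("JUMP", location).
    """
    cc = 0                          # control counter, initially cleared
    trace = []
    for _ in range(words):
        sr = cc                     # alpha: transfer CC to SR
        cr = memory[sr]             # beta: transfer memory word to CR,
        cc += 1                     # after which CC is increased by unity
        for half in cr:             # gamma, then delta
            trace.append((sr, half[0]))
            if half[0] == "JUMP":
                cc = half[1]        # transfer of control changes CC reading
    return trace
```

A word at location 0 whose second half is ("JUMP", 5) causes the next word to come from location 5, after which the control counter again advances by unity to 6.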

Program  counter 

The multistage instructions, such as multiplication, are guided
through their various steps by the program counter (PC). The
program  counter  has  four  stages  or  16  positions.  All  multistage 
instructions  can  be  performed  within  this  number  of  steps. 

Checking  circuits 

The  checking  circuits  of  the  UNIVAC  are  of  two  main  types, 
odd-even  checkers  and  duplicated  equipment  with  comparison 
circuits.  The  odd-even  checker  depends  upon  the  design  of  the 
pulse  code  used  within  the  computer.  This  code  provides  seven 
pulse  positions  for  every  character.  Six  of  the  seven  positions  are 
significant  as  the  actual  code  while  the  seventh  is  the  odd-even 
channel.  If  the  number  of  pulses  or  ones  within  the  first  six  chan- 
nels of  any  character  is  even,  a  one  is  placed  in  the  seventh  channel 
to  make  the  total  odd.  Thus,  the  total  number  of  ones  across  the 
seven  channels  is  always  odd.  By  means  of  a  binary  counter  and 
a  few  gates,  an  odd-even  checker  has  been  constructed  which 
examines  every  seven  pulse  group  which  passes  through  the  high 
speed  bus  amplifier.  In  this  connection,  mention  must  be  made 
of the periodic memory check which interrupts operation every
five  seconds  to  pass  the  entire  contents  of  the  memory  over  the 
high  speed  bus  system  and,  consequently,  through  the  odd-even 
checker.  Any  discrepancy  is  immediately  signalled  to  the  super- 
visory control  and  further  operation  ceases. 
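
The odd-even rule can be stated compactly in Python. This is a sketch of the rule only; the function names are invented, and placing the check pulse in the lowest bit of the seven-pulse group is an arbitrary choice for the example, since the text does not fix the order of the channels.

```python
def with_check_pulse(code6):
    """Append the seventh (odd-even) pulse to a six-pulse character code."""
    if not 0 <= code6 < 64:
        raise ValueError("character code must fit in six pulse positions")
    ones = bin(code6).count("1")
    check = 1 if ones % 2 == 0 else 0    # make the total number of ones odd
    return (code6 << 1) | check          # check pulse position is assumed


def passes_odd_even_check(group7):
    """True when a seven-pulse group contains an odd number of ones."""
    return bin(group7).count("1") % 2 == 1
```

Every group produced by with_check_pulse passes the check, while a group with any single pulse gained or lost fails it. A double error within one character goes undetected by this kind of check.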

The  duplicated  equipment  type  of  checking  consists  of  dupli- 
cating the  most  essential  part  of  the  arithmetic  circuits  and  their 
controls  and  producing  simultaneously  independent  results,  which 
can  then  be  compared  for  equality.  For  this  type  of  checking,  the 
A-, F-, X-, and L-registers, algebraic adder, comparator,
multiplier-quotient counter, and the high speed bus amplifier are
duplicated.

The  memory  is  not  duplicated,  but  is  checked  by  the  periodic 
memory  check  mentioned  previously.  Various  sections  of  the  con- 
trol circuits  are  duplicated  such  as  the  program  counter  and  cycle 
counter. 

Timing  pulse  generator  and  cycling  unit 

The  timing  pulse  generator  and  cycling  unit  (CU)  are  the  source 
of  the  basic  timing  signals  throughout  the  computer.  The  timing 
pulses  occur  at  2.25  megacycles  per  second.  The  cycling  unit 
subdivides  this  rate  into  the  character  rate  and  word  rate.  The 
character  rate  is  one  seventh  of  the  basic  pulse  rate  since  there 
are  seven  pulses  for  each  character.  There  are  12  characters  per 
word  but  space  for  a  13th  character  is  included  in  a  word  time 
and  is  called  the  space  between  words.  This  time  is  used  for 
switching  purposes. 

The cycling unit, therefore, develops the word signals at
1/7 × 1/13 or 1/91 of the basic pulse rate. Within the cycling unit
(CU) are numerous duplications and comparisons to ensure complete
reliability.
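
The rates implied by these figures can be worked out exactly with Python's Fraction type. The variable names are chosen for the example; the numbers are those given in the text (2.25 megacycles per second, seven pulses per character, thirteen character times per word).

```python
from fractions import Fraction

PULSE_RATE = Fraction(2_250_000)      # basic timing pulses per second

char_rate = PULSE_RATE / 7            # seven pulses per character
word_rate = char_rate / 13            # 12 characters plus space between words
word_time_us = 1_000_000 / word_rate  # duration of one word time, microseconds

print(float(char_rate))               # about 321,429 characters per second
print(float(word_rate))               # about 24,725 words per second
print(float(word_time_us))            # about 40.4 microseconds per word
```

The word rate is 2,250,000/91 per second, so one word time lasts 364/9 microseconds, or roughly 40.4 microseconds.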

Input-output  circuits 

The  operation  of  the  input-output  system  is  dovetailed  as  effi- 
ciently as  possible  with  the  operation  of  the  arithmetic  circuits. 
Whenever  possible,  parallel  operations  are  allowed  to  proceed  so 
as  to  minimize  the  time  lost  on  internal  operation  while  the  slower 
input-output  operations  are  taking  place. 

The  principal  input-output  instructions  are  handled  in  a  man- 
ner identical  to  that  for  the  internal  operations,  except  that  now 
the  function  table  develops  signals  which  bring  the  input-output 
control  circuits  into  operation.  The  information  supplied  to  the 
input-output control circuits by the function table includes the
following: 

1  Which  of  the  ten  possible  Uniservos  is  being  called  on 

2  Whether  it  is  a  read  or  write,  that  is,  an  input  or  output 
operation 

3  If  it  is  "read,"  the  direction  in  which  the  tape  is  to  move 

The input-output control circuits, therefore, begin by testing
whether the indicated Uniservo is currently in use. If
it is already in use, everything else waits until that Uniservo is
free. Next, the input-output control circuits test to determine
whether the Uniservo selected last moved backward or forward.
If the previous direction does not agree with the new direction
called for, the input-output control circuits generate the proper
signals to prepare the Uniservo to move in the opposite direction.
If the instruction is to rewind a Uniservo, the input-output control
circuits then direct the center drive of the selected Uniservo to
rewind the tape to the beginning and stop.

As  soon  as  the  instruction  has  proceeded  to  the  point  where 
the  input-output  control  circuits  need  no  further  information  from 
the  function  table,  the  instruction  ending  signal  is  generated 
and  the  internal  circuits  proceed  to  the  next  instruction,  even 
while  the  reading,  writing  or  rewinding  continues.  The  UNIVAC 
can  process  an  input,  an  output  and  several  rewind  operations 
while simultaneously carrying on internal computation.

So far the method by which the words are transferred from
the I-register to the memory has not been mentioned. This operation
is combined with certain read instructions in a manner not
immediately obvious. There are two instructions which read from
the tape to the I-register, one causing the tape to move forward,
the other causing it to move backward. There are two other input
instructions similar to those just mentioned, but they have the
additional operation of first reading from the I-register to the
memory and then reading a new group of 60 words from tape into
the I-register. Thus the first type of input instruction reads from
tape to the I-register only. It must be followed by the second type
of instruction in order first to clear the I-register and then read
in the second block of 60 words.
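The interplay of the two kinds of read instruction can be sketched as a small simulation. This is only an illustrative model under stated assumptions: the class and method names (`TapeInput`, `read_to_i`, `store_and_read`) are hypothetical, tape direction is ignored, and an exhausted tape simply leaves the I-register empty.

```python
BLOCK_WORDS = 60  # words per tape block, as stated in the text


class TapeInput:
    """Toy model of the I-register sitting between tape and memory."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # blocks still on the tape
        self.i_register = None      # one 60-word block, or None when empty
        self.memory = []            # blocks already transferred to memory

    def _next_block(self):
        block = self.blocks.pop(0)
        assert len(block) == BLOCK_WORDS
        return block

    def read_to_i(self):
        """First type of instruction: tape -> I-register only."""
        self.i_register = self._next_block()

    def store_and_read(self):
        """Second type: I-register -> memory, then tape -> I-register."""
        self.memory.append(self.i_register)
        self.i_register = self._next_block() if self.blocks else None


tape = TapeInput([[f"b{n}w{i}" for i in range(BLOCK_WORDS)] for n in range(3)])
tape.read_to_i()       # block 0 waits in the I-register
tape.store_and_read()  # block 0 -> memory, block 1 -> I-register
tape.store_and_read()  # block 1 -> memory, block 2 -> I-register
```

In this model one instruction of the first type primes the I-register, after which each instruction of the second type both empties and refills it.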

The output instructions do not operate in this way but instead
read directly from memory to the O-register and then to the tape
as one instruction.

A  third  type  of  checking  circuit  occurs  in  the  input-output 
control  circuits  which  counts  the  number  of  characters  transferred 
from  the  tape  in  each  block.  Since  there  must  always  be  720 
characters  per  block,  the  720  checker  signals  any  discrepancy  to 
the supervisory control.
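The count that the 720 checker enforces follows from the block format, 60 words of 12 characters each. A minimal sketch of the check, with a hypothetical function name `check_block`:

```python
CHARS_PER_WORD = 12
WORDS_PER_BLOCK = 60
CHARS_PER_BLOCK = CHARS_PER_WORD * WORDS_PER_BLOCK  # 720


def check_block(chars):
    """Return True when exactly 720 characters were counted in the block."""
    return len(chars) == CHARS_PER_BLOCK
```

A block that is one character short or long fails the check, which in the machine is the condition signaled to the supervisory control.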

One  other  phase  of  the  input-output  operation  concerns  the 
two supervisory control input-output instructions. One of them
permits  a  single  word  to  be  typed  in  from  the  input  keyboard  and 
the  other  causes  a  single  word  to  be  typed  out  automatically. 

Auxiliary  equipment 

The two principal auxiliary devices mentioned earlier were the
Unityper,¹ which converts keyboard operations to tape recording,
and the Uniprinter,¹ which converts magnetic recording to typewritten
copy.

¹Registered trade mark.


162  Part  2     The  instruction-set  processor:  main  line  computers 


Section  1     Processors  with  one  address  per  instruction 


Unityper.  A  simple  block  diagram  of  the  Unityper  is  shown  in  Fig. 
2.  Each  keyboard  operation  pulses  the  input  to  an  encoding  func- 
tion table  which,  in  turn,  drives  the  appropriate  heads  for  record- 
ing the  particular  combination  on  the  tape.  Simultaneously,  the 
same  pulse  triggers  a  motor  delay  flop  which  operates  the  tape 
motor  for  an  interval  sufficient  to  move  the  tape  across  the  head 
for  the  distance  required  to  record  one  character.  However,  there 
is  a  punched  paper  loop  system  associated  with  the  Unityper  for 
the  purpose  of  providing  the  typist  with  various  guideposts  individ- 
ually set  up  for  each  problem.  The  loop  control  system  serves  three 
distinct  control  functions.  First,  it  allows  the  programmer  to  set 
up  various  numbers  of  characters  for  the  individual  items  being 
entered for a given problem. If the typist ever enters other than
the  specified  number  of  characters,  the  loop  control  signals  an 
error.  Although  the  basic  word  length  is  12  characters,  the  pro- 
grammer may  subdivide  or  group  the  words  to  suit  any  length  of 
item.  The  loop  can  then  be  punched  with  what  are  called  "force 
check"  punches.  Whenever  the  typist  completes  a  correctly  en- 
tered item,  she  must  operate  a  release  key  before  entering  the  next 
item.  If  the  forced  check  is  released  too  early  an  error  is  created, 
or  if  an  additional  character  is  typed  after  the  forced  check  should 
have  been  released,  an  error  is  similarly  indicated. 

The second function of the loop is to control the erase operation.
The erase operation is the only way in which an error can
be recalled. When the erase key is operated, the loop and tape
are both stepped backward until a stop punch (usually associated
with each forced check) is encountered. Thus the entire erroneous
item is erased, and at a much higher rate than that at which the
backspace key can be operated. The backspace, incidentally, cannot
cancel an error indication, but it can be used to correct a
wrongly typed character if the typist recognizes it.

Fig. 2. Simplified block diagram of Unityper. (Blocks labeled in
the figure: encoding function table, recording head, loop motor,
DF1, loop photocells, interpreting relays.)

The third function of the loop system is to enter, automatically,
various fill-in characters. Under one such system of operation, the
loop control records the characters only at the behest of the operator.
This function is useful where individual entries, such as
personal names, do not fill out all of the space allotted. The other
operation is fully automatic: the loop assumes full control
to record, for example, a group of fill-in characters later to be
replaced by computed data within the central computer.

The  block  diagram  therefore  shows  the  loop  motor  connected 
to  the  same  delay  flop  that  steps  the  tape  motor.  The  same  signal 
which  moves  the  two  motors  also  sets  a  second  delay  flop  (DF2) 
which  produces  a  delayed  probing  pulse.  The  probing  pulse  exam- 
ines the  paper  loop  photoelectrically  for  the  new  combination. 
A  third  delay  flop  (DF3)  produces  another  probing  pulse  after  the 
relays  associated  with  the  loop  photocells  have  had  time  to  set 
up. If any automatic function is indicated by the photocells, the
probing  pulse  passes  through  the  interpreting  relays,  enters  the 
encoding  function  table  to  generate  the  fill-in  characters,  and  thus 
starts  the  cycle  over  again.  All  automatic  functions  take  place  at 
about  22  characters  per  second. 

Numerous  odd-even  checks  are  introduced  in  the  Unityper  to 
provide  checks  on  tape  and  loop  motion  and  on  the  recorded  code 
combination. 

Uniprinter.  The  Uniprinter  is  shown  in  simplified  block  diagram 
in  Fig.  3.  Its  operation  is  a  simple  cycle  which  is  initiated  by  a 
start  button.  The  start  button  triggers  the  motor  flip-flop  (MFF). 
The  motor  pulls  the  tape  across  the  reading  head  until  a  combina- 
tion is  detected.  The  presence  of  pulses  on  any  of  the  seven  lines 
between the reading head and the relay decoding function table
is sufficient to restore the motor flip-flop (MFF) and stop the tape
motion. Simultaneously a print delay flop (DF1) is triggered.
During  the  delay  flop  interval,  the  decoding  relays  are  given  time 
to  set  up.  When  the  delay  flop  recovers,  a  pulse  is  sent  through 
the  relay  table  which  reappears  at  one  of  the  typewriter  magnetic 
actuators.  As  the  typebar  reaches  the  platen,  a  printer  action 
switch (PAS) is operated which pulses the motor flip-flop and starts
a  new  search  for  the  next  character  on  the  tape.  The  odd-even 
properties  of  the  UNIVAC  pulse  code  are  utilized  for  checking 
purposes. 


Chapter 8 | The UNIVAC system 163


Fig. 3. Simplified block diagram of Uniprinter.

Engineering aspects

The  entire  UNIVAC  system  is  constructed  of  circuits  which  are 
as conservative as is consistent with the desired reliability and
speeds  of  operation.  The  circuits  have  been  designed  as  building 
blocks  and  the  entire  computer  is  constructed  around  these  blocks. 

One  of  the  most  important  of  these  blocks  is  the  pulse  reshap- 
ing circuit  which  consists  of  a  timing  pulse  gate  and  a  fast  acting 
flip-flop  which  generates  the  pulse  envelope  equivalent  of  the 
gated  timing  pulses.  Two  polarities  of  timing  pulse  are  used,  the 
one  being  capable  of  tripping  the  flip-flop  into  one  state,  the  other 
polarity of tripping it to the other state. As a deteriorated pulse
envelope  is  applied  to  the  timing  pulse  gate  input,  either  one  or 
the  other  polarity  of  pulse  is  always  gated.  The  flip-flop  therefore 
produces  a  sharpened  and  correctly  timed  output  waveform. 

The gating and switching circuits in the central computer, which
include the main and subordinate function tables, are constructed
of germanium crystal diodes.

The registers are all circulating delay type using a mercury
tank  of  one,  two,  or  ten  word-times  of  delay,  except  the  static 
register.  The  latter  is  composed  of  27  flip-flops  which  are  required 
to  maintain  the  static  signals  applied  to  the  function  tables,  for 
at  least  an  entire  word-time. 

The switching time allowed by the seven pulse-times of the
space between words is, in general, not sufficient for a new function
table excitation to stabilize. Therefore the time-out system
used successfully in the BINAC also is employed in the UNIVAC.
Whenever an ending pulse is generated, or any other pulse which
indicates that a new set of control signals is required from the
function table, an interval of one word-time is introduced to allow
the function table signals to reach equilibrium. The time-out interval
is controlled by a single fast-acting flip-flop. All gates
attached to the function table signals which are critical as to
opening and closing can be inhibited by the time-out flip-flop
during time out. Regardless of the presence of the function table
signals, the gate does not operate until the time-out flip-flop releases
it. Thus, the burden of speed imposed by the short space
between words has been shifted to a single flip-flop which can
accommodate the needs of the entire computer.

The  UNIVAC  uses  the  excess-three  pulse  code  system  which 
requires  a  second  binary  adder  after  the  main  binary  adder  in  order 
to  provide  the  excess-three  correction  after  each  addition.  On  the 
other  side  of  the  ledger,  the  complementing  operation  for  sub- 
traction and  division  is  very  much  simplified,  since  the  substitution 
of  ones  for  zeros  and  vice  versa  is  sufficient  to  form  a  complement. 
The  excess-three  part  of  the  pulse  code  occupies  the  four  least 
significant  digit  positions.  The  next  two  positions  beyond  the 
excess-three  digits  are  used  as  zone  indicators.  When  these  digits 
are  both  zero,  the  last  four  positions  are  interpreted  as  a  numerical 
quantity;  when  nonzero,  an  alphabetic  or  punctuation  symbol  is 
indicated.  The  seventh  channel  is  the  check  pulse  channel. 
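The trade-off described above, an extra correction step after addition in exchange for complementing by simple inversion, can be illustrated on the four excess-three positions of a single digit. This is a sketch of the arithmetic only, not of the UNIVAC adder circuits; the function names are hypothetical, and the zone and check positions are left out.

```python
def xs3_encode(digit):
    """Four-bit excess-three code of a decimal digit 0..9."""
    if not 0 <= digit <= 9:
        raise ValueError("digit out of range")
    return digit + 3


def xs3_decode(code):
    """Decimal digit represented by a four-bit excess-three code."""
    return code - 3


def xs3_complement(code):
    """Nines complement: substitute ones for zeros and vice versa."""
    return ~code & 0b1111


def xs3_add(a, b, carry_in=0):
    """Add two excess-three digits; return (sum_code, carry_out).

    The plain binary sum carries an excess of six. The correction step
    (the role played by the second adder) removes three when there is
    no decimal carry and adds three back when there is one.
    """
    raw = a + b + carry_in       # main binary addition
    if raw > 0b1111:             # binary carry out coincides with decimal carry
        return (raw - 16) + 3, 1
    return raw - 3, 0


code, carry = xs3_add(xs3_encode(5), xs3_encode(7))  # 5 + 7 = 12
```

Inverting the code of a digit d gives 15 - (d + 3) = (9 - d) + 3, which is the excess-three code of its nines complement; this is why no further logic is needed to form a complement.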

The  adder  is  provided  with  an  alphabetic  bypass  circuit  which 
allows  an  alphabetic  letter  to  enter  one  input  and  emerge  un- 
scathed provided  a  numeral  enters  the  other  input.  Thus  additive 
numerical constants can be combined with instruction words to
adjust the memory location part of an instruction without affecting
the alphabetic instruction symbols.

The  power  supply  for  the  computer  is  separately  housed.  It 
can  be  placed  any  reasonable  distance  from  the  central  computer. 
Almost all rectification is done by dry disc rectifiers. The power
supply  provides  all  a-c  and  d-c  potentials  to  the  central  computer, 
supervisory  control,  directly-connected  printer,  and  the  Uniservos. 

A complete fusing system has been included which serves both
as protection and as a short-circuit isolating means. Each section,
of which there are 39, is locally fused, enabling the engineer to
locate a short within only 12 chassis, rather than the total of 468.

An  automatic  voltage  monitoring  system  may  be  used  to  test 
every  d-c  voltage  at  the  rate  of  one  per  second.  A  meter  movement 
relay  signals  any  discrepancy  from  standard.  Similarly,  overheat 
thermostats  detect  any  unfavorable  temperature  condition  in  the 
bays  or  mercury  tanks. 

Cooling  for  the  power  supply  and  central  computer  is  provided 
by three blowers. Local cooling in the Uniservos is provided by
small  fans  in  each  unit.  The  operating  statistics  of  the  UNIVAC 
are  as  follows: 




Tape reading and recording:
    Pulse density: 120 per inch
    Tape speed: 108 inches per second
    Input block size: 60 words; 720 characters
    Tape width: 1/2 inch; 8 channels

Internal operations:
    Memory capacity: 1,000 words; 12,000 characters
    Memory construction: 100 mercury channels; 10 words/channel
    Access time:
        Average: 202 microseconds
        Maximum: 404 microseconds
    Word length:
        12 characters
        91 pulses (including space between words = 7 pulses)
    Basic pulse rate: 2.25 megacycles
    Addition: 525 microseconds
    Subtraction: 525 microseconds
    Multiplication: 2,150 microseconds
    Division: 3,890 microseconds

(All times shown include time for obtaining instructions and
operands from memory)
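The access times listed above can be checked against the pulse rate and word length. The sketch below assumes seven pulse positions per character (the seven channels of the pulse code), so that 12 characters plus the seven-pulse space give 91 pulses per word; it further assumes that the maximum access time is one full circulation of a 10-word mercury channel and that the average is half of that. The variable names are illustrative.

```python
PULSE_RATE_HZ = 2.25e6   # basic pulse rate
PULSES_PER_CHAR = 7      # assumption: one pulse position per code channel
CHARS_PER_WORD = 12
SPACE_PULSES = 7         # space between words
WORDS_PER_CHANNEL = 10

pulses_per_word = CHARS_PER_WORD * PULSES_PER_CHAR + SPACE_PULSES
word_time_us = pulses_per_word / PULSE_RATE_HZ * 1e6   # one word-time
max_access_us = WORDS_PER_CHANNEL * word_time_us       # full circulation
avg_access_us = max_access_us / 2                      # half a circulation
```

Under these assumptions the word-time comes to about 40.4 microseconds, and the maximum and average access times to about 404 and 202 microseconds, in agreement with the figures quoted.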


Applications  of  UNIVAC 

Types  of  problems  for  which  UNIVAC  is  applicable 

True  to  its  name,  Universal  Automatic  Computer,  the  UNIVAC 
system  is  capable  of  handling  data  processing  or  calculation  in 
virtually  all  fields  of  human  endeavor.  It  is  particularly  well  suited 
to  applications  requiring  large  volumes  of  input  or  output  data, 
or  both. 

For  convenience  and  classification,  applications  of  the  UNIVAC 
will  be  treated  under  four  headings:  scientific,  statistical,  logistical, 
and  commercial.  The  scientific  problem  usually,  though  not  al- 
ways, has  relatively  small  amounts  of  input  and  output  data,  with 
emphasis  on  computation.  The  statistical  problem  has  relatively 
large  volumes  of  input  data  with  a  small  volume  of  output  data 
and  simple  processing  procedures.  The  commercial  and  logistical 
problems  both  have  relatively  large  amounts  of  input  and  output 
data  with  processing  requirements  varying  from  slight  to  relatively 
great.  A  number  of  problems  in  each  of  these  four  fields  have  been 
studied  and  found  suited  for  solution  on  the  UNIVAC  system. 
Several  in  each  field  have  actually  been  processed  on  the  com- 
puter. 


Scientific  problems 

A general-purpose matrix algebra routine designed to add, subtract,
multiply, and reciprocate matrices of orders up to 300 has
been  prepared  and  applied  to  a  number  of  matrices.  Inverses  have 
been  calculated  for  three  different  matrices  of  orders  40,  50,  and 
44.  The  error  matrices  for  the  first  two  of  these  inverses  also  were 
calculated.  In  both,  the  largest  error  term  was  of  the  order  of  10~*. 
A  triple  product  matrix  was  formed  from  component  matrices 
ranging  from  5  by  40  to  40  by  40.  A  check  product  was  obtained 
by  reversing  the  sequence  of  multiplications,  verifying  the  original 
product  to  within  2  units  in  the  11th  place.  The  computer  time 
required  for  these  calculations  was  1  hour  and  15  minutes  to 
calculate  the  inverse  of  order  50,  45  minutes  to  determine  its  error 
matrix.  The  other  calculations  were  proportionately  shorter.  In  all 
of  this  work,  magnetic  tapes  were  used  as  temporary  storage  for 
the  bulk  of  the  matrix  elements  involved.  The  high  speed  of  the 
tape  reading  units  more  than  kept  up  with  the  computer's  need 
for  data.  No  mathematical  checks,  other  than  the  over-all  check 
mentioned,  were  included  in  the  computation,  the  self-checking 
features  of  the  system  making  these  completely  unnecessary. 

A  second  computation — that  of  obtaining  six  different  specific 
solutions  to  a  system  of  385  simultaneous  equations — was  com- 
pleted in  27  minutes  on  the  computer.  The  system  of  equations 
arose  from  a  second  order  nonlinear  differential  equation  of  gas 
flow  through  a  turbine.  The  error  terms  resulting  from  the  sub- 
stitution of  the  computed  unknowns  into  the  basic  equation  were 
of  the  order  of  10"'^ 

The  third  example  is  that  of  a  2-dimensional  Poisson  equation, 
using  a  22  by  22  mesh.  Each  iteration  required  13  seconds  and 
produced  a  maximum  separation  of  successive  surfaces  of  the  order 
of  10"^  after  approximately  300  iterations. 
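The text does not say which iterative method was used for the Poisson problem. As an illustration of what "separation of successive surfaces" measures, the sketch below applies simple Jacobi relaxation on a 22 by 22 mesh and records the largest change between successive iterates. The method, the zero boundary values, the uniform source term, and all names are assumptions made for the example.

```python
N = 22  # mesh points per side, as in the text


def relax(u, f, h):
    """One Jacobi sweep for u_xx + u_yy = f; returns (new_u, max_change)."""
    new = [row[:] for row in u]
    max_change = 0.0
    for i in range(1, N - 1):
        for j in range(1, N - 1):
            new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                + u[i][j - 1] + u[i][j + 1]
                                - h * h * f[i][j])
            max_change = max(max_change, abs(new[i][j] - u[i][j]))
    return new, max_change


h = 1.0 / (N - 1)
u = [[0.0] * N for _ in range(N)]   # zero boundary values (assumed)
f = [[-1.0] * N for _ in range(N)]  # uniform source term (assumed)

changes = []
for _ in range(300):                # roughly the iteration count quoted
    u, change = relax(u, f, h)
    changes.append(change)
```

The list `changes` holds the maximum separation after each sweep; iteration would normally stop once this value falls below a chosen tolerance.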

Statistical  problems 

In the second major field, statistical computation, the Census
problem has been a prime example. The Census problem produces
a part of the Second Series Population Tables for the 1950
Decennial Census.

The Second Series contains 30 types of tables covering the
statistics  of  our  population — age,  sex,  race,  country  of  birth,  edu- 
cation, occupation,  employment,  and  income.  These  tables  are  to 
be  compiled  for  every  county,  and  for  every  city,  rural  farm,  and 
rural  nonfarm  area  within  a  county. 

The  preparation  of  these  tables  by  the  UNIVAC  system  requires 
three  major  steps: 

1  Tabulation of each individual's characteristics by groups of
about 7,000

2  Arranging these groups by cities, counties

3  Assembling from the tabulations the data required for each
table

The  raw  data  were  prepared  in  the  form  of  a  punched  card 
for  each  individual  in  the  United  States.  The  data  from  these 
enumeration  cards  are  then  transcribed  onto  magnetic  tape.  From 
these  tapes,  the  computer  processes  the  data  sequentially  through 
the  three  steps,  producing  output  tapes  from  which  the  tables  are 
printed on Uniprinters. The only manual operations encountered
in this entire procedure are the handling of the original punched
cards, mounting and demounting tape reels (the equivalent of 9,700
cards), and the removal of the printed tables from the Uniprinters.

The  most  important  feature  of  the  present  procedure  is  the  elim- 
ination of  handling  and  sorting  tremendous  quantities  of  punched 
cards.  Each  handling  of  the  card  stacks  is  a  source  of  potential 
error  and  delay.  The  UNIVAC  memory  permits  the  simultaneous 
accumulation  of  the  580  tallies  which  describe  our  population 
for each local area being studied in the UNIVAC system.

Commercial  problems 

In  the  commercial  field,  the  UNIVAC  system  has  handled  premium 
billing for a life insurance company. This program produces premium
notices, dividends, and commissions. In a particular example
worked out, approximately 1,000,000 bills, 340,000 dividends, and
100,000 commissions have to be produced monthly. The necessary
information for processing a particular policy is contained in 240
digits, or, in special cases, 480. This compactness is made possible
by a logical system of 40 symbols, comprising both alphabetic and
numeric characters, which denote over 90 definitions. The UNIVAC
processes the policies as directed by the symbols, policy
dates, and policy numbers.

The problem includes inserting over 250,000 changes each
month before further handling is done. After this step, the policies
to be processed are selected from a file of 1,500,000 items. Next,
a list is produced of the cases which have symbols indicating that
special notices must be sent to the policyholders. Following the
calculation of dividends and commissions, additional lists are produced:
one group contains information pertaining to commissions
and agents; another contains information regarding dividends; and
finally, there is a listing of option changes for later insertion into
the policy files. Policies requiring premium notices are then edited
and the notices are automatically printed from the data contained
on magnetic tapes.

The UNIVAC time needed for a program of this proportion
is about 135 hours a month. The average computer time per policy
processed is less than 0.5 second. The average time for all change
insertions, printing, calculations, and unityping is 9 seconds per
item.
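The monthly total and the per-policy figure can be compared directly. The short calculation below assumes that the roughly 1,000,000 monthly bills correspond to the policies processed; that correspondence is an assumption, not something the text states.

```python
hours_per_month = 135
policies_per_month = 1_000_000  # assumption: one policy per monthly bill

seconds_per_policy = hours_per_month * 3600 / policies_per_month
```

Under that assumption the result is about 0.49 second per policy, consistent with the stated average of less than 0.5 second.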

Logistical  problems 

In  the  field  of  logistics,  five  major  studies  have  been  conducted, 
four  of  these  resulting  in  actual  problems  executed  on  the  com- 
puter. 

The first is the type of computation in which the basic purpose
is to determine quantitatively whether a given operational or mobilization
plan can be logistically supported. The ultimate aim
is to find, by calculation, the optimum program for carrying out
such plans. At the time of writing, only a small model has been
actually run on UNIVAC, but full size models will be run within
the next few weeks. Two computations have been executed, one
a set of three tables of thousands of lines each, giving a detailed
breakdown of machine deployment, fuel requirements, and overhaul
requirements. The other problem was a computation of the
amounts of critical raw materials required to construct a given
number of each type of equipment, these requirements being
phased by quarters over a 2-year period. The fourth problem,
which was actually computed, was a sample of a similar calculation
in which every pound of critical raw material required each
month for the ultimate construction of a complete building program
was computed.

The UNIVAC program which was prepared is capable of
accommodating every type of equipment, individually tailored
construction schedules, detailed bills of materials running into the
millions of items and of determining the actual amounts of alloy
elements based on thousands of tables of percentages for the many
alloys employed. The demonstration showed that this computation
for 400 pieces of equipment of a given type could be executed
in three hours of computer time. The last problem in this field
has not yet been run, but the study has shown that the entire
gamut of stock control for a large supply office can be covered
by the computer in approximately 3 weeks' time.

This  program  involves  the  maintenance  of  stock  balances  of 
hundreds  of  thousands  of  stock  items  for  many  service  points  and 
provides  for  the  preparation  of  stock  transfer  orders,  purchase 
requisitions, critical lists and summary reports.

Performance  record  of  the  UNIVAC 

Acceptance  tests 

The Acceptance Tests, prepared jointly by the Bureau of Standards
and Bureau of Census, are fully discussed in the following paper
by Dr. Alexander and Mr. McPherson.¹ However, a few comments
concerning them from the engineering point of view are appropriate.

¹Paper not included in this book. See McPherson and Alexander [1951].

The  Census  computer  was  given  two  tests;  the  first,  a  test  of 
its  computational  ability;  the  second,  a  test  of  its  input-output 
system  which  particularly  stressed  the  tape  reading  and  recording 
abilities. 

The  Central  Computer  Acceptance  Test  A  consisted  of  two 
parts.  During  Part  1,  every  available  internal  operation,  except 
input-output operations, was performed. Among these operations
were addition, subtraction, comparisons, division, and three
different types of multiplication operations. Each of the arithmetic
operations handled a pair of 11-decimal digit quantities.
Altogether there were about 2,500 operations in the routine, yet
the  entire  routine  required  only  1.26  seconds  to  do.  The  routine 
was  performed  808  times  in  17  minutes  making  a  total  of  about 
2,000,000  operations  in  all. 
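The figures quoted for Part 1 of Test A are mutually consistent, as a short calculation shows. The variable names are illustrative, and the per-operation average is a derived figure, not one given in the text.

```python
ops_per_routine = 2_500
seconds_per_routine = 1.26
repetitions = 808

total_ops = ops_per_routine * repetitions               # about 2,000,000
total_minutes = seconds_per_routine * repetitions / 60  # about 17
avg_us_per_op = seconds_per_routine / ops_per_routine * 1e6
```

The derived average of roughly 500 microseconds per operation is of the same order as the addition time listed in the operating statistics.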

The  second  part  of  Test  A  included  the  solution  of  a  heat 
distribution  equation,  a  short  routine  involving  the  input-output 
device  and  a  sorting  routine.  The  sorting  routine  arranged  ten 
numerical  quantities  each  containing  12  decimal  digits  in  correct 
numerical  order  in  about  0.2  second.  All  three  routines  took  a  total 
of 1½ minutes to perform. They were performed twice for each
test and when added to Part 1 made a total of 20 minutes for
unit test A.

The  Acceptance  Test  B  examined  the  input-output  tape  devices 
(Uniservos).  During  the  first  part  of  Test  B,  2,000  blocks  or  about 
1.4  million  digits,  which  included  every  available  character 
(numeric  and  alphabetic)  were  recorded  on  a  tape  and  then  read 
back  into  the  computer  with  the  tape  moving  backward.  The 
information  read  back  was  then  compared  with  the  original  data 
read  out.  The  recording  operation  required  about  4  minutes  while 
reading  back  and  comparison  required  about  8  minutes.  The  sec- 
ond part  of  Test  B  consisted  of  recording  and  reading  over  one 
spot of tape for 700 passes in order to determine the readability
of tape as it wears. This test required 13 minutes and when combined
with Part 1, made a total of approximately 25 minutes for
Test  B.  This  test  was  repeated  19  times. 
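The volume of data in the first part of Test B follows from the block size of 720 characters. The effective rates below are derived figures that include all starting, stopping, and comparison time; they are not quoted in the text, and the variable names are illustrative.

```python
blocks = 2_000
chars_per_block = 720

total_chars = blocks * chars_per_block               # about 1.4 million
record_chars_per_s = total_chars / (4 * 60)          # recording took about 4 minutes
readback_chars_per_s = total_chars / (8 * 60)        # read-back and compare, about 8 minutes
```

The total of 1,440,000 characters matches the "about 1.4 million digits" stated for this part of the test.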

The  first  test  run  passed  in  6.6  hours  (minimum  theoretical 
time:  6.0  hours)  and  the  second  test  was  passed  in  9.47  hours 
(minimum  theoretical  time:  7.45  hours).  Of  the  2.02  hours  down 
time,  1.45  hours  were  accumulated  at  one  time  with  the  remaining 
0.58  hours  spread  over  the  rest  of  the  test. 

The  Uniprinter  test  required  that  a  block  of  information  (60 
words)  be  printed  200  times  in  tabular  form.  The  minimum  time 
for  printing  was  five  hours.  The  test  was  passed  in  6.16  hours. 

The card-to-tape test required that ten good reels of tape be
produced in 12 hours. There were certain restrictions as to reading
accuracy and other criteria of reproducing ability which defined
"good" reels. In 10 hours, the converter had prepared over 15 reels,
14 reels had been tested, 11 of the 14 were found satisfactory and
the converter was accepted for payment.

Although the test was run on only one of two converters, the
Bureau  of  Census  put  both  card-to-tape  machines  into  operation 
and  after  six  months  of  use,  the  acceptance  test  was  run  on  the 
second  card-to-tape  converter.  This  test  differed  to  some  extent 
from  the  first  test  in  that  the  Census  Bureau  was  satisfied  with 
the  reading  ability  of  the  machines  and  did  not  require  a  digit-by- 
digit  verification  of  the  information.  However,  a  new  stipulation 
was  added  that,  after  the  engineers  had  checked  the  converter 
out  preparatory  to  running  the  test,  the  converter  was  to  be  used 
in  actual  operation  for  eight  hours  before  doing  the  remainder  of 
the  test  with  no  engineering  intervention  between  the  two  portions 
of  the  test.  The  first  part  was  run  on  Friday,  October  5,  1951;  the 
device  remained  idle  Saturday  and  Sunday  and  was  turned  on 
Monday  morning  to  complete  the  test.  It  passed  with  flying  colors, 
preparing  ten  acceptable  reels  (out  of  ten  reels)  plus  two  decks 
of  check  cards  in  slightly  less  than  7  hours.  Both  card-to-tape 
converters  now  are  in  Washington  and  the  remainder  of  the  system 
is  in  operation  by  the  Bureau  of  the  Census  on  the  Eckert-Mauchly 
premises  in  Philadelphia. 

Reliability  and  factors  affecting  performance 

The  first  UNIVAC  system  now  has  been  operating  for  approxi- 
mately 8  months.  In  that  time,  much  has  been  learned  about  how 
UNIVACs  should  be  operated  and  maintained.  The  situation  has 
been  somewhat  complicated  by  having  to  shake  down  the  equip- 
ment while  in  the  customer's  possession;  that  is,  there  were  certain 
faults  in  the  system  from  both  engineering  and  production  stand- 
points which  could  only  become  apparent  in  the  course  of  time 
and  under  actual  operation  conditions.  For  example,  weak  tubes 
or  faulty  solder  joints  did  not  reveal  their  presence  at  the  time 
of  installation.  Another  type  of  difficulty  only  became  apparent 
under  certain  duty  cycle  conditions  imposed  by  various  types  of 
problems.  Because  only  certain  problems  present  this  particular 
duty  cycle,  these  troubles  remained  in  the  machine  causing  inter- 
mittent stoppages  until  they  could  be  tracked  down. 

Patient isolation and elimination of such problems, most of
which have occurred only with conditions of operation infrequently
encountered, is a powerful, though sometimes painful,
proving ground for the engineering group charged with such responsibility.
The experience and depth of judgment acquired by
such a group in the course of performing such work have become
unmistakably apparent in the already noted improved performance
of following UNIVACs and generally advanced ability to predict
and realize performance in any large scale and complex apparatus
of the same character.

Some  of  the  troubles  encountered  are  interesting  to  study  in 
detail.  On  a  rather  complicated  routine  requiring  the  use  of  a 
number  of  Uniservos,  all  ran  smoothly  for  15  minutes.  At  that  time, 
one  of  the  Uniservos  executing  a  backward  read  somewhere  in  the 
middle of the reel, did not stop at the end of the block but continued
to run until it ran off the end of the tape. After much work,
it  was  shown  that  a  cycling  unit  signal  was  being  overloaded 
because  it  was  being  used  both  by  a  multiplication  instruction  and 
the  backward  read  which  were  occurring  simultaneously.  The 
input  precessor  loop  was  cleared  as  a  result  and  the  count  of  the 
pulses  coming  off  the  tape  was  thereby  lost.  Once  the  trouble  was 
found,  it  was  simple  to  remedy. 

.\nother  rather  interesting  case  occurred  intermittently  over 
an  extended  period.  Normallv  when  reading  out  of  the  memory, 
the  contents  should  not  he  cleared.  Occasionally,  however,  reading 
from  the  memory  also  caused  the  contents  to  be  cleared.  .\s  the 
trouble  only  remained  for  a  period  of  seconds  or,  at  most,  a  few 
minutes,  it  was  somewhat  difficult  to  localize.  Of  course,  parasitic 
oscillations  of  some  sort  were  suspected  and,  in  fact,  the  trouble 
was  traced  to  the  actual  source  on  a  logical  basis;  but  the  source, 
a  high  power  cathode  follower,  showed  no  evidence  of  oscillation. 
Before  the  problem  was  remedied,  various  combinations  of  para- 
sitic suppressors  were  tried;  the  trouble  would  vanish  for  perhaps 
a  week  and  then  return.  The  oscillation  finally  cropped  up  during 
a  maintenance  shift,  was  found  to  be  in  the  suspect  tube  at  100 
megacycles and was eliminated rather easily.

Other  types  of  troubles  that  have  occurred  include  intermittent 
parasitic oscillations in other circuits, bounce in Uniservo relay
circuits,  various  mechanical  problems  in  Uniservos,  time  constants 
not  consistent  with  the  longest  duty  cycle  signals,  and  various 
types  of  noise  in  the  input  circuits.  The  tubes,  which  initially  were 
bothersome,  have  now  stabilized  to  the  point  where  two  tubes 
per  week  (on  the  average)  stop  the  computer  during  computation. 

All  of  the  above  troubles  and  others  not  discussed  here  have 
contributed  to  lost  computing  time  on  the  UNIVAC.  However, 
they  cannot  influence  future  operation  because  the  reasons  for 
them  have  been  found  and  eliminated.  The  fact  that  these  troubles 
will not occur in future UNIVACs cannot be emphasized too
strongly. 

Under  a  contract  with  the  Bureau  of  Census,  Eckert-Mauchly 
Computer  Corporation  maintains  the  Census  installation.  This 
system  is  operated  24  hours  a  day,  seven  days  a  week,  except  for 
four  8-hour  preventive  maintenance  shifts  each  week.  This  allows 
approximately  32  hours  for  regular  maintenance  and  136  hours 
for operation or 21 and 79 per cent respectively. Table 1 shows
the engineering time spent on the computer system during typical
weeks of operation. The figures are given both in hours and percentages.
Both nonscheduled engineering time and preventive
maintenance time are shown. The sum of the two gives the
total  engineering  time  spent  on  the  computer  per  week.  It  should 
be  noted  that  this  is  actual  engineering  time  and  does  not  include 
time  that  the  computer  may  have  been  shut  down  while  waiting 
for  an  engineer  to  report.  According  to  our  maintenance  contract, 
this must be within a half hour during regular working hours and
within  two  hours  at  all  other  times.  Attention  should  be  given  to 
the  fact  that  the  preventive  maintenance  time  does  not  total 
exactly  32  hours  each  week.  This  is  due  in  part  to  a  half-hour 
period  each  morning  devoted  to  checking  and  cleaning  the 
mechanical  portions  of  Uniservos.  It  is  expected  that  this  work 
will  be  taken  over  by  the  UNIVAC  operators  since  the  procedures 
and  the  techniques  involved  are  quite  simple. 

Part 2 The instruction-set processor: main-line computers

Section 1 | Processors with one address per instruction

In addition, one extra shift was required the week ending June
3 and three extra shifts the week ending October 7, 1951. These
shifts were required to incorporate engineering changes which had
been developed over a period of time and could not be incorporated
in the equipment during the normal preventive maintenance
time. The nonscheduled engineering time has varied from
as little as 10.1 hours or 6 per cent to 42.3 hours or 25 per cent.
The last column in the Table shows the amount of nonscheduled
engineering time as compared to the allowable operating time
(total time less preventive maintenance time). Here there is a
variation from 7.6 to 31.7 per cent and an average for the weeks
shown of 16.9 per cent. It is believed that these figures, while good
for the first months of operation of a new piece of equipment, will
show definite improvement over the next year.

Table 1

Week          Nonscheduled         Preventive           Total engineering    Percentage of
ending        engineering          maintenance          time                 nonscheduled
              Hours   Per cent     Hours   Per cent     Hours   Per cent     engineering

June   3      18.9    11.3         40      23.8         58.9    35.1         14.8
      26      20.5    12.2         34      20.2         54.5    32           15.3
July  14      14.7    8.8          33      19.6         47.7    28           10.9
      21      19.4    11.6         34.5    20.5         53.9    32           14.5
      28      39.2    23.3         34.5    20.5         73.7    43.8         29.4
Aug.   4      26.2    15.6         33      19.6         59.2    35.2         19.4
Sept.  2      28.8    17.1         34.5    20.5         63.3    37.7         21.6
       9      16.1    9.6          34.5    20.5         50.6    30           12.1
      16      22.6    13.5         33      19.6         55.6    33           16.7
      23      42.3    25.2         34.5    20.5         76.8    45.7         31.7
      30      21.8    13.0         34.5    20.5         56.3    33.5         16.3
Oct.   7      15.9    9.5          56      33.3         71.9    42.8         14.2
      14      14.0    8.3          34.5    20.5         48.5    28.9         10.5
      21      10.4    6.2          34.5    20.5         44.9    26.7         7.8
      28      20.8    12.4         33      19.6         53.8    32           15.4
Nov.   4      40.4    24.0         34.5    20.5         74.9    44.6         30.3
      11      10.1    6.0          34.5    20.5         44.6    26.5         7.6
      18      30.5    18.2         34.5    20.5         65      38.7         22
      25      13.7    8.2          34.5    20.5         48      28.6         10
Dec.   2      14.8    8.7          34.5    20.5         49.3    29.3         12.6
       9      19.6    11.7         34.5    20.5         54.1    32.2         14.7
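
The percentage figures quoted for Table 1 can be checked with a few lines of arithmetic. The sketch below is an editorial illustration in Python, not part of the original paper; the function and key names are hypothetical, and it assumes a 168-hour week with the last column defined, as the text states, as nonscheduled hours divided by total time less preventive maintenance time.

```python
HOURS_PER_WEEK = 24 * 7  # 168 hours in a week


def weekly_percentages(nonscheduled_hours, preventive_hours):
    """Return the percentage figures of Table 1 for one week (unrounded)."""
    total_hours = nonscheduled_hours + preventive_hours
    allowable_hours = HOURS_PER_WEEK - preventive_hours  # total less preventive
    return {
        "nonscheduled_pct": 100 * nonscheduled_hours / HOURS_PER_WEEK,
        "preventive_pct": 100 * preventive_hours / HOURS_PER_WEEK,
        "total_pct": 100 * total_hours / HOURS_PER_WEEK,
        "nonscheduled_vs_allowable_pct": 100 * nonscheduled_hours / allowable_hours,
    }


# The two extreme weeks mentioned in the text.
best = weekly_percentages(10.1, 34.5)   # week ending Nov. 11
worst = weekly_percentages(42.3, 34.5)  # week ending Sept. 23

for label, week in (("Nov. 11", best), ("Sept. 23", worst)):
    print(label, {name: round(value, 1) for name, value in week.items()})
```

For these two weeks the computed values agree with the 6 and 25 per cent and the 7.6 and 31.7 per cent quoted above.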

Although  the  opportunity  to  prove  or  disprove  the  following 
theory  of  operation  has  not  presented  itself,  it  is  believed  logical 
that  optimum  use  of  the  UNIVAC  equipment  might  be  obtained 
by  means  of  scheduling  preventive  maintenance  only  at  such  times 
as  it  is  indicated  in  the  judgment  of  competent  operators.  In  other 
words,  there  are  many  occasions  preceding  a  scheduled  main- 
tenance shift  when  the  system  is  performing  very  well.  At  such 
times,  it  is  extremely  inefficient  to  shut  down  the  operation  in 
order  to  provide  maintenance.  For  many  reasons,  however,  it  has 
been  impossible  to  operate  and  maintain  the  first  system  in  this 
way.  It  is  hoped  that  such  operation  will  be  possible  in  following 
installations. 

It  should  be  realized  that  the  UNIVAC  system  requires  a  super- 
visor of  the  same  caliber  as  the  one  required  for  a  large  punched 
card  installation.  However,  the  large  group  of  operating  personnel 
would  be  replaced  by  a  small  group  of  well-trained  extremely 
competent  people  thoroughly  familiar  with  the  details  of  the 
computer  and  associated  equipment.  The  time  spent  in  providing 
a  high  degree  of  training  for  these  people  is  more  than  repaid  in 
increased  operating  efficiency  and  consequently  higher  work  out- 
put. For  example,  situations  arise  in  the  course  of  running  a  prob- 
lem where  a  correct  operational  decision  can  save  hours  of  elapsed 
computation.  Also,  a  competent  operator  will  recognize  malfunc- 
tions sufficiently  early  to  prevent  serious  delays.  He  is  capable  of 
deciding  whether  to  continue  with  machine  operation  or  to  stop 
to diagnose. The second UNIVAC system, which is ready for
installation  in  Washington,  will  be  operated  by  a  group  of  engi- 
neers who  have  been  trained  in  operation  and  maintenance.  This 
procedure,  it  is  believed,  will  result  in  the  UNIVAC  system  being 
of  maximum  benefit  to  the  Air  Comptroller's  Office. 

Evaluation  of  UNIVAC  design 

Checking  features 

Maintenance  of  the  UNIVAC  has  been  vastly  simplified  by  use 
of  duplicate  arithmetic  and  control  equipment  and  other  checking 
methods.  Many  factors  which  would  have  led  to  undetected  errors 


have,  by  virtue  of  duplication,  immediately  stopped  the  computer. 
Although  checking  by  means  of  inverse  operations  can  provide 
operational  checks  on  the  arithmetic  circuits,  there  is  some  ques- 
tion as  to  whether  it  provides  as  good  a  check  as  duplication. 
However,  in  connection  with  odd-even  codes,  it  may  conceivably 
be comparable. It should be remembered, however, that this is
from  an  operational  standpoint  and  not  a  maintenance  standpoint. 
When  the  control  equipment  is  considered  it  is  difficult  to  visualize 
a  check  that  is  as  good  as  duplicated  equipment.  Other  checks 
that  are  utilized  in  UNIVAC  include  the  periodic  memory  check, 
intermediate  line  function  table  checker,  function  table  output 
checker,  memory  switch  checker,  and  720  checker. 

As  explained  earlier  in  the  paper,  the  periodic  memory  check 
is  accomplished  by  reading  out  of  all  memory  channels  sequen- 
tially and  performing  an  odd-even  check  on  each  digit  as  it  passes 
through  the  high  speed  bus  amplifier.  The  period  at  which  the 
check  is  repeated  may  be  varied  over  a  large  interval.  At  present, 
it  is  set  at  5  seconds,  the  check  taking  52  milliseconds  or  about 
1  per  cent  of  the  computing  time. 
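
As an editorial illustration (not part of the original paper), the periodic memory check can be sketched in Python. The sketch assumes that each character is a group of pulses carrying an odd number of 1's when correct; the paper says only "odd-even check", so the choice of odd parity, the seven-pulse characters, and all names here are assumptions.

```python
CHECK_PERIOD_S = 5.0      # interval between checks, from the text
CHECK_DURATION_S = 0.052  # duration of one check, from the text


def odd_even_ok(pulses):
    """Assumed convention: a correct character has an odd number of 1 pulses."""
    return sum(pulses) % 2 == 1


def periodic_memory_check(channels):
    """Read every channel in sequence; return (channel, position) of each failure."""
    return [
        (channel_index, position)
        for channel_index, characters in enumerate(channels)
        for position, pulses in enumerate(characters)
        if not odd_even_ok(pulses)
    ]


# Two hypothetical channels; the first character of channel 1 has a bad count.
channels = [
    [(1, 0, 0, 0, 0, 0, 0), (1, 1, 1, 0, 0, 0, 0)],
    [(1, 1, 0, 0, 0, 0, 0)],
]
failures = periodic_memory_check(channels)
overhead = CHECK_DURATION_S / CHECK_PERIOD_S  # fraction of computing time
print(failures, f"{overhead:.2%}")
```

The overhead works out to 52 ms in every 5 s, which is the "about 1 per cent" quoted above.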

The  function  table  has  a  check  at  the  very  input  by  bringing 
in  the  check  pulse  in  each  character  so  that  if  an  odd-even  error 
occurs  between  the  control  register  and  the  static  register,  no  order 
will  be  set  up  and  the  computer  will  grind  to  a  halt!  If  the  input 
sets  up  properly  but  an  error  occurs  farther  on  in  the  table,  but 
not  ahead  of  the  intermediate  lines  (the  linear  set  into  which  the 
input  combinations  are  decoded),  the  error  is  caught  at  this  point. 
The  intermediate  lines  are  broken  into  groups  in  such  a  way  that 
an  error  is  indicated  when  more  than  one  line  is  set  up  in  one 
group  or  the  entire  set.  There  is  an  exception  to  this  in  some  groups 
where  no  error  is  indicated  by  this  checker  if  more  than  one  line 
is  set  up  within  the  group. 

This  has  been  allowed  only  in  those  cases  where  it  has  been 
shown  that  setting  up  two  or  more  lines  will  cause  some  other 
checker  or  checkers  to  indicate  the  trouble. 

If  the  error  occurs  beyond  the  intermediate  lines,  the  output 
checker  then  comes  into  play.  This  checker  makes  an  odd-even 
count  on  the  number  of  gates  used  on  each  instruction:  dummy 
lines having been added so that the count is normally always odd.
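
The idea behind the output checker can be sketched as follows. This is an editorial illustration in Python under simplifying assumptions: gate lines are represented as a set of names, and a single dummy line is added wherever an instruction would otherwise use an even number of gates. The names are hypothetical and do not describe the actual UNIVAC wiring.

```python
DUMMY_LINE = "dummy"  # hypothetical name for the added line


def wire_instruction(gates):
    """Return the lines for one instruction, padded so that the count is odd."""
    lines = set(gates)
    if len(lines) % 2 == 0:
        lines.add(DUMMY_LINE)
    return lines


def output_check_passes(energized_lines):
    """Odd-even count on the lines actually energized."""
    return len(energized_lines) % 2 == 1


wired = wire_instruction({"gate_a", "gate_b"})
print(output_check_passes(wired))               # True: normal case
print(output_check_passes(wired - {"gate_a"}))  # False: one gate missing
print(output_check_passes(wired | {"gate_x"}))  # False: one spurious gate
```

A check of this kind catches any single missing or spurious gate; two simultaneous errors would leave the count odd and pass unnoticed.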

The  memory  switch  or  tank  selector  checker  ensures  that  one 
and  only  one  memory  channel  is  selected  on  any  instruction.  It 
checks each of the two digit positions separately, indicating which,
if either, is in error.

The  720  checker  counts  the  digits  coming  off  the  tape  and  if 
there  are  either  more  or  less  than  720  in  one  block,  the  computer 
stops;  by  examining  the  indicators  on  the  supervisory  control 
console,  the  operator  can  determine  the  number  of  digits  actually 




read.  By  means  of  some  rather  simple  manipulations,  the  operator 
can  then  reread  the  block  without  losing  his  place  in  the  routine; 
and if the information is then read correctly, he may again start
the computer on the routine. The same procedure may be followed
if  an  odd-even  error  is  made  in  reading  from  the  tape. 
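
The behaviour of the 720 checker, together with the reread procedure, can be sketched as follows. This is an editorial illustration in Python, not the original design: on UNIVAC the reread is carried out by the operator at the console, whereas here it is modelled as a retry loop. `read_block` is a hypothetical callable that returns the digits of one block.

```python
BLOCK_DIGITS = 720  # digits expected in one tape block


def check_block(digits):
    """Return (ok, count); the count is what the console indicators would show."""
    count = len(digits)
    return count == BLOCK_DIGITS, count


def read_with_reread(read_block, max_attempts=2):
    """Read a block, rereading it if the digit count is wrong."""
    counts = []
    for _ in range(max_attempts):
        digits = read_block()
        ok, count = check_block(digits)
        counts.append(count)
        if ok:
            return digits, counts
    raise RuntimeError(f"block length wrong on every attempt: {counts}")


# Hypothetical tape: the first read drops a digit, the reread is correct.
attempts = iter([["0"] * 719, ["0"] * 720])
block, counts = read_with_reread(lambda: next(attempts))
print(counts)  # [719, 720]
```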

Many  checks  other  than  those  mentioned  before  have  been 
built into the UNIVAC. On the basis of operating experience, the
engineers cannot recommend too strongly the use of built-in
checking facilities. All in all, the faith that can be put into results
obtained from an unchecked computer comparable in size to
UNIVAC is in the writers' opinion exceedingly low.

More than this, however, the methods by which the UNIVAC
is  checked  have  been  of  extreme  usefulness  in  trouble  shooting. 
The  duplication  of  circuits  has  amply  repaid  the  increase  of  space 
and  the  number  of  components  required  by  this  checking  system. 

General  comments 

After  evaluating  UNIVAC  performance  over  a  period  of  eight 
months, the over-all picture of the UNIVAC design, in the minds


of  its  designers,  is  extremely  good.  Certain  phases  of  its  design 
exceeded  expectations,  while  of  course,  other  phases  were  some- 
what disappointing.  The  first  eight  months  of  actual  operation 
have taught more than years of experimentation with laboratory
models.  Many  improvements  have  already  been  conceived  of  this 
experience  and  are  continuing  daily  to  increase  reliability. 

The  other  major  factor  influencing  computer  design,  cost,  has 
been  duly  considered  in  the  UNIVAC  design;  and  it  is  being  met 
with  plans  for  a  continuing  full-scale  production  of  UNIVAC  sys- 
tems. As  the  production  techniques  are  developed  concurrently 
with the engineering design details, the UNIVAC becomes the
realization  of  a  hope  which  has  long  been  in  the  minds  of  its 
designers:  An  economical,  completely  reliable  commercial  com- 
puter for  performing  the  routine  mental  work  of  the  world  much 
as  automatic  machinery  has  taken  over  the  routine  mechanical 
work  of  the  manufacturer. 

References 

McPhJ51.




Section  2 

Processors  with  a  general  register 
state 

The  processors  described  in  this  section  all  have  a  processor 
state  consisting  of  registers  which  are  used  for  multiple  (i.e., 
general)  purposes.  Perhaps  a  better  name  might  be  processors 
with  a  state  consisting  of  a  register  array(s).  The  following 
machines  are  fairly  similar  in  their  ISP  structure:  Pegasus 
(Chap.  9),  the  DEC  PDP-6,10,  the  SDS  Sigma  5  and  7,  and 
the  UNIVAC  1107  and  1108.  However,  other  computers  includ- 
ing an  8-bit  character  computer  (Chap.  10)  and  the  CDC  6600 
(Chap.  39)  also  use  arrays  of  registers. 

The  general  register  organization  appears  as  a  compromise 
between  the  1  and  2  address  organizations.  It  avoids  some  of 
the  extra  instructions  for  shuffling  data,  inherent  in  a  1  address 
system,  but  avoids  taking  the  space  for  a  full  additional  address. 
The  index  register  organization  is  also  such  a  compromise,  but 
one  that  is  specialized  to  address  calculations.  The  general 
register  organization  moves  further  toward  a  full  2  address 
organization  without  much  additional  cost.  This  assumes  a 
small  relative  cost  for  a  small  amount  of  memory  that  is  sig- 
nificantly faster  than  the  larger  Mp. 

The  design  philosophy  of  Pegasus, 
a  quantity-production  computer 

Chapter  9  describes  Pegasus's  logical  organization  and  the 
technology  from  which  it  was  implemented.  The  technology 
includes  vacuum  tubes,  a  cyclic  memory,  and  dynamic  logic 
based  on  delay  lines.  Pegasus  has  the  nicest  ISP  processor 
structure  discussed  in  this  section— perhaps  in  the  book.  It  is 
included  because  it  is  probably  the  first  machine  to  use  an  array 
of  general  registers  as  accumulators,  multiplier-quotient  regis- 
ters, index  registers,  etc.  This  ISP  organization  should  be  com- 
pared with  the  IBM  System/360  (Chap.  43).  Note  that  the 


multiple-register  organization  is  independent  of  Mp. cyclic.  This 
organization  improves  performance  by  generality. 

The structure of System/360

Part  I— outline  of  the  logical  structure 

The  IBM  System/360  is  described  in  Part  6,  Sec.  3,  and  is 
included  mainly  because  of  the  very  large  number  of  such 
systems  that  have  been  built. 


An  8-bit-character  computer 

This  computer  (Chap.  10)  has  been  invented  by  the  authors  to 
show  the  composite  features  of  a  small  character/word-oriented 
computer.  In  reality,  8-bit  machines  turn  out  to  look  either  like 
16-bit  machines,  because  the  Mp  size  accessed  is  usually  >2^ 
words,  or  like  character-string  processors.  Because  of  the 
primitive  nature  of  this  machine,  it  is  a  possible  alternative  to 
the  larger  more  complex  microprogrammed  processors  for 
defining  more  complex  ISP's. 

Parallel  operation  in  the  Control  Data  6600 

The  CDC  6600,  described  in  Chap.  39,  has  three  arrays  of  eight 
registers  each.  Two  of  the  arrays  are  used  rather  generally,  and 
the  third  array  is  used  to  access  words  in  Mp.  The  design  of 
the  CDC  6600  is  a  classic  because  of  the  computing  power  it 
provides.  It  is  also  worth  studying  as  an  example  of  a  Pc 
assigned  exclusively  to  data  operation,  with  all  concern  with  the 
larger  PMS  structure  located  in  Pio's.  A  discussion  of  it  is  given 
in  Part  5,  Sec.  4,  page  470. 




Chapter  9 


The  design  philosophy  of  Pegasus, 
a quantity-production computer¹

W. S. Elliott / C. E. Owen / C. H. Devonald
B. G. Maudsley

Summary  The  paper  gives  an  historical  account  of  the  development  of 
the  packaged  method  of  construction  of  computers,  and  the  advantages 
of  this  method  are  discussed.  The  packages  used  in  the  computer  Pegasus 
are  described  from  both  an  electronic  and  a  mechanical  point  of  view.  The 
specification  of  the  machine  is  given  and  the  arguments  which  led  to  this 
specification  are  discussed.  The  detailed  logical  design  procedure  leading 
from  the  specification  to  the  wiring  lists  is  described.  The  method  of 
maintenance and some reliability figures are given.

Introduction 

The development of standard plug-in unit circuits ('packages') for
digital computers began in this country [England] in 1947, and
some of the advantages of the method have been discussed in
earlier papers [Elliott, 1951; Johnston, 1952; Elliott et al., 1952;
Elliott et al., 1953]. The advantages start in the design stage of
a new computer project and follow through production and commissioning
to maintenance.

In  the  design  stage,  what  is  known  as  "logical"  design  is  sepa- 
rated from  engineering  design.  Once  the  packages  have  been 
designed by electronic engineers and the rules for their interconnection
have been laid down, the 'logical designers' (usually,
but not necessarily, mathematicians) can begin organizing the
packages  into  various  computers  to  carry  out  different  functional 
requirements.  The  electronic  and  mechanical  design  work  invested 
in  the  packages  is  thus  drawn  on  for  more  than  one  computer 
design,  and  each  computer  can  be  assembled  from  stock  parts 
without  further  engineering  effort.  Design  time  and  cost  are  there- 
fore much  reduced. 

In  production,  whether  we  consider  one  design  of  computer 
or  several  designs  using  the  same  packages,  costs  and  time  are  also 
much  reduced.  Quantity  production  lines  for  the  relatively  few 
types  of  standard  package  are  set  up,  and  are  common  to  different 
computer  designs,  thus  reducing  inspection  and  planning  costs. 
Standard  cabinet  work  has  been  designed  for  Pegasus,  and  this 

¹Proc. IEE, pt. B, vol. 103, supp. 2, pp. 188-196, 1956.


too  can  be  taken  from  stock  or  established  production  lines  to  make 
other  computers. 

In  commissioning  a  computer,  because  all  the  packages  have 
been  pretested,  when  power  is  first  applied  to  the  complete 
machine it is known that a large part is already fault-free. It
remains  to  detect  a  few  errors  which  may  have  been  made  in  the 
interconnections. 

Perhaps  an  even  more  important  consideration  is  ease  and 
speed  of  maintenance.  Test  programmes  will  usually  indicate  the 
part  of  the  machine  in  which  a  fault  is  occurring.  Several  monitor 
sockets are located on the front of each package, and by inspection
the faulty package is speedily found and replaced.

The package method has been criticized on the grounds of the
cost  and  questionable  reliability  of  plugs  and  sockets,  and  some 
redundancy  of  components. 

The  authors  believe  that  the  many  advantages  far  outweigh 
the  cost  of  plugs  and  sockets.  The  present  trend  is  to  use  copper- 
etched  printed  circuits,  and  these  fall  naturally  into  the  plug-in 
unit  idea,  the  plug  contacts  being  part  of  the  printed  wiring;  there 
has been no trouble in Pegasus from plugs and sockets. Component
redundancy in Pegasus is about 10% of the diodes and a few
resistors, the cost of redundant components being about £150.

Electrical  design  of  the  packages 

Circuits  used  for  arithmetic  and  switching  operations 

Historical. A previous data-processing machine [Elliott et al.,
1952; Elliott et al., 1956b] used 330 kc/s serial-digital circuits: they
had originally been designed for 1 Mc/s operation, but 330 kc/s
was chosen to suit an anticipation-pulse cathode-ray-tube store. This
frequency has been retained to the present time because it suits
the magnetostriction delay-line store [Fairclough, 1956] and the
magnetic-drum store [Merry and Maudsley, 1956]. Experience
with the data processor led to work (commenced in 1951) on a
new set of circuits [Elliott et al., 1952], particular emphasis being




laid  on  flexibility  of  use  and  ability  to  work  without  error  in  high 
electrical  interference  fields.  These  circuits  form  the  basis  of  those 
in  Pegasus. 

Operations  to  be  carried  out.  The  following  well-known  opera- 
tions are  used  to  build  up  the  logical  structure  of  the  computer: 

a  'And.'  This  operation,  which  may  be  carried  out  between 
two  or  more  input  serial  trains  of  pulses,  produces  an  output 
train  in  which  pulses  occur  only  when  pulses  are  present 
at  the  same  time  on  all  inputs. 

b  'Or.'  This  operation  produces  an  output  train  in  which 
pulses  occur  at  all  times  when  a  pulse  is  present  on  any 
of  a  number  of  inputs. 

c 'Not.' 1's are changed into 0's and 0's into 1's; this is
achieved  by  inverting  the  pulse  train. 

d  Digit  Delay.  The  passing  of  a  pulse  train  through  a  digit 
delay  produces  a  pulse  train  similar  to  the  input,  but  each 
pulse  is  one  pulse  position  later  in  timing  and  restandard- 
ized  in  shape. 

All  operations  in  the  computer,  including  addition,  subtraction, 
and  staticizing,  are  carried  out  by  combinations  of  these  elements. 
There  is  no  circuit  specifically  for  addition,  and  there  are,  in 
general,  no  flip-flops  such  as  are  often  used  for  staticizing  or  storing 
a  single  digit.  A  similar  philosophy  was  arrived  at  independently 
by the designers of SEAC and DYSEAC [Elbourne and Witt, 1953],
but  the  detailed  working  out  is  considerably  different. 
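
The four operations listed above can be modelled on serial pulse trains. The following Python sketch is an editorial illustration, not part of the original paper: a pulse train is represented as a list of 0's and 1's, one entry per digit position, and the function names are hypothetical. It models only the logical behaviour, not the waveforms or the restandardizing of pulse shape.

```python
def and_(*trains):
    """Output pulse only where every input train has a pulse."""
    return [int(all(digits)) for digits in zip(*trains)]


def or_(*trains):
    """Output pulse wherever any input train has a pulse."""
    return [int(any(digits)) for digits in zip(*trains)]


def not_(train):
    """Invert the train: 1's become 0's and 0's become 1's."""
    return [1 - digit for digit in train]


def digit_delay(train):
    """Shift every pulse one digit position later; the last digit drops off."""
    return [0] + train[:-1]


a = [1, 1, 0, 1, 0]
b = [0, 1, 0, 0, 1]

print(and_(a, b))        # [0, 1, 0, 0, 0]
print(or_(a, b))         # [1, 1, 0, 1, 1]
print(and_(a, not_(b)))  # [1, 0, 0, 1, 0]  ('A and not B')
print(digit_delay(a))    # [0, 1, 1, 0, 1]
```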

Digit waveforms. The timing of digit pulses throughout the machine
is controlled by a common 'clock' waveform, a 3 microsec
square wave (Fig. 1a) in which the positive-going portions
define digit positions.

The digit pulses, which are routed about the machine and applied
to logical circuits, are generally of the form shown in Fig.
1b; as generated, they have their leading edges well in advance
of  the  clock  pulse  and  are  of  a  greater  amplitude.  This  means  that 
considerable  distortion  of  the  pulse  is  tolerable,  since  only  the 
portion  which  coincides  with  positive  clock  pulse  is  of  conse- 
quence. Digit  pulse  trains  are  'clocked'  ('and'  operation  with  clock) 
only  at  their  entry  into  a  storage  system  or  into  a  digit-delay 
circuit. 

Inverted pulses are also employed: as an illustration, consider
the operation 'A and not B'. Pulses A and B (Fig. 1) are on two
lines and are of the same nominal timing, and we wish to form
A . B̄ (symbolic representation of 'A and not B'). To do this pulse
B is inverted (forming B̄, or 'not B') and is used to gate pulse A
and prevent its passage. The inverted pulse B̄ will be a little late
on B, which also may have been later than A, as shown in Fig.
1c; thus when A and B̄ are 'anded' together a spike may be produced,
as shown in Fig. 1e. This spike, however, lies between clock
pulses and so will be rejected on clocking.

The pulse system used allows several logical operations to be
performed in cascade without any loss in nominal timing, so easing
the problem of logical design (particularly by permitting afterthoughts).
The maximum number of logical operations performed
in cascade in Pegasus is five, though up to 12 could be performed
in special circumstances.

[Figure: waveform panels (a) to (e). Labelled values: 3 µsec and 1.5 µsec intervals; levels of +2 to +3 volts, -10 to -11 volts, +13 volts minimum, and -10 volts.]

Fig. 1. Basic waveforms.

[Figure: circuit diagram. Labels include input, clock, clocked digit train, reset, output 1, output 2, and load pins; supply levels of +200, -10, -20, and -150 volts.]

Fig. 2. Digit-delay circuit.

The logical circuits. Each of the logical packages has more than
one  circuit  unit.  A  circuit  unit  is  defined  as  that  part  of  a  package 
which  has  input  and  output  pins,  and  no  connections  to  other  parts 
of  the  package  other  than  supplies.  We  may  make  the  following 
generalizations: 

a    Each  unit  has  an  'and'  gate  at  its  input. 

b    Each unit has a cathode-follower output (half a 12AT7
valve).

c    Each unit has an additional output via a germanium diode
for making 'or' gate connections.

[Note:  There  are  exceptions  to  (a)  and  (c)  on  one  package  type.] 

There are three possibilities for the part of the circuit unit
between  the  input  'and'  gate  and  the  output  cathode-follower, 
namely  a  digit  delay  (half  a  12AT7  valve),  an  inverter  (half  a 
12AT7  valve),  and  a  direct  connection.  Space  does  not  permit  a 
description  of  all  the  circuits,  so  it  is  proposed  to  deal  only  with 
the digit delay.

The circuit is shown in Fig. 2, and some typical waveforms
are shown in Fig. 3. The input circuit can be of two forms, namely
a 3-input 'and' gate and two such gates with their outputs 'or-ed'
together. In both cases there is a further gating with a clock pulse.
The clocked digits from the gate input circuit are applied to the
grid of V1, the anode voltage of which falls, so building up a
current in L. When V1 is cut off at the end of the digit, this current
flows through diodes D1 and charges up a storage condenser, C,
which is discharged at the end of the next clock pulse by a 'reset'
pulse applied through D2. The reset pulse supply is a common
computer supply whose amplitude and phasing relative to the
clock pulse is shown in Fig. 3.

It will be noted that the reset pulse is also present at a time,
just after V1 is cut off, when the current in the inductor is about
to charge the storage condenser. This merely has the effect of
deferring the charging of C until the end of the reset pulse, the
current in the meantime continuing to flow through the diodes
with little loss in the stored energy of L, since the voltage across
L is low at this time.

[Figure: waveform panels (a) to (d). Labelled values: a 0.6 µsec interval; levels of +18 volts, -20 to -25 volts, -20 volts, -70 volts approximate, -10 volts, and +20 volts approximate.]

Fig. 3. Digit-delay waveforms.

The output cathode-follower V2 is caught at -10 volts in the
negative direction by a diode; this safeguards the crystal-diode
circuits driven by it in the event of failure of the h.t. supply or
V2, and it removes residual ripple on the bottom of the input
waveform,  and  thus  reduces  the  back  voltage  and  hence  leakage 
in  diodes  of  gates  driven  by  the  output. 

The  second  output  through  a  diode  can  be  used  in  conjunction 
with similar outputs from other circuits and a resistor (pins 3 and
4) to make an 'or' (up to about 16-way).

In  general,  each  output  circuit  has  two  available  load  resistors, 
disposed  between  direct  and  'or'  outputs  according  to  a  set  of  rules 
which  are  applied  for  each  case.  The  number  of  units  which  can 
be  driven  by  an  output  can  vary  between  three  and  16  according 
to  circumstances;  where  more  have  to  be  driven  than  the  rules 
allow,  use  is  made  of  'booster'  cathode-followers  available  on  one 
of  the  packages. 

Some  examples  of  the  use  of  the  logical  circuits 

Two  examples  will  be  given,  the  first  being  a  simple  arrange- 
ment— the  staticizor — which  is  used  frequently,  and  the  second 
being  a  complicated  arrangement — the  adder/subtracter — which 
is  used  infrequently.  The  symbols  used  to  indicate  the  circuit  units 
are  shown  in  Figs.  2c  and  5b. 


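[Figure: logic diagram of the adder/subtracter, with control leads marked carry suppression, add, and subtract, and an output X + Y or X - Y (delayed one digit). Panel (b) gives the symbols for the AND gate, cathode follower, inverter, and digit delay.]

Fig. 5. The adder/subtracter.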


The staticizor. The function of a staticizor is to remember the
fact that a digit occurred at a particular time, for an indefinite
period, the method generally used in Pegasus being shown in Fig.
4. A digit delay with a twin 'and' gate input has its output connected
to one of its inputs. It is turned on by gate 1, which causes
a digit to circulate as long as the inputs to gate 2 remain positive.
It is normally turned off by an inverted pulse (a '0' following a
series of 1's) on one of the gate 2 inputs.
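
The staticizor can be modelled as a one-digit delay whose output is fed back through one of its two input gates. The Python sketch below is an editorial illustration under simplifying assumptions: gate 1 is reduced to a single 'set' train, the remaining gate 2 inputs are reduced to a single 'hold' train, and a positive lead is written as 1. The names are hypothetical.

```python
def staticizor(set_train, hold_train):
    """Simulate a digit delay with its output fed back through gate 2.

    set_train:  gate 1 input; a 1 turns the staticizor on.
    hold_train: gate 2 input other than the feedback; a 0 turns it off.
    Returns the output of the digit delay at each digit time.
    """
    output = []
    circulating = 0  # digit about to emerge from the digit delay
    for set_digit, hold_digit in zip(set_train, hold_train):
        output.append(circulating)
        # The two 'and' gate outputs are 'or-ed' at the delay input.
        circulating = set_digit | (circulating & hold_digit)
    return output


set_train = [0, 1, 0, 0, 0, 0, 0, 0]   # single setting digit at time 1
hold_train = [1, 1, 1, 1, 1, 0, 1, 1]  # inverted pulse (a 0) at time 5
print(staticizor(set_train, hold_train))  # [0, 0, 1, 1, 1, 1, 0, 0]
```

The output appears one digit time after the setting digit and persists until one digit time after the hold lead goes to 0.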


Staticizor  is  set 
these  leads  ore 
positive 


Stoticizor  is  turned 
/  ott  it  either  of  these 
eods  is  negative 


Fig.  4.  The  staticizor. 
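
The behaviour described for the staticizor can be sketched in a few lines of Python. This is only an illustrative model, not the Pegasus circuit: the function names, the use of booleans for positive and negative leads, and the treatment of each call as one digit time are all assumptions made for the example.

```python
def staticizor_step(state, set_leads, hold_leads):
    """Return the digit emerging from the delay one digit time later.

    state      -- digit currently circulating (True = '1')
    set_leads  -- inputs to gate 1; all positive (True) sets the staticizor
    hold_leads -- inputs to gate 2, alongside the fed-back output;
                  any negative (False) lead stops the circulation
    """
    gate1 = all(set_leads)
    gate2 = state and all(hold_leads)
    return gate1 or gate2


def run(set_sequence, hold_sequence, state=False):
    """Apply one pair of lead settings per digit time and record the output."""
    history = []
    for set_leads, hold_leads in zip(set_sequence, hold_sequence):
        state = staticizor_step(state, set_leads, hold_leads)
        history.append(int(state))
    return history


# A single set pulse at time 1, and an inverted pulse on one hold lead at time 4.
set_sequence = [(False,), (True,), (False,), (False,), (False,), (False,)]
hold_sequence = [(True, True)] * 4 + [(False, True)] + [(True, True)]
print(run(set_sequence, hold_sequence))
```

The digit keeps circulating after the set pulse has gone and is lost as soon as one of the gate 2 leads goes negative.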


The adder/subtracter. Figure 5 shows an adder/subtracter unit
with inputs X and Y and an output X + Y for the sum or X − Y
for the difference. There are two further input control leads
marked 'add' and 'subtract'. If the 'add' lead is held positive
while the 'subtract' lead is held negative, the unit acts as an adder.
If the 'subtract' lead is held positive and the 'add' lead negative,
the unit acts as a subtracter. Carry suppression is controlled by
the lead marked 'carry suppression'. Carries are allowed to propagate
when this lead is held positive, so that a negative signal on
this lead will suppress carry.

Table 1 gives the digits appearing at the outputs of logical
elements in the adder/subtracter unit for all combinations of input
and carry digits when the unit is operating as an adder.
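
The external behaviour of a serial adder/subtracter with carry suppression can be illustrated as follows. This is a behavioural sketch only: it does not reproduce the gate arrangement of Fig. 5 or the internal points of Table 1, and the function names and the least-significant-digit-first list representation are assumptions made for the example.

```python
def to_bits(value, width):
    """Two's-complement digits of value, least significant first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Unsigned value of a list of digits, least significant first."""
    return sum(bit << i for i, bit in enumerate(bits))


def serial_add_sub(x_bits, y_bits, subtract=False, carry_allowed=None):
    """Process one digit per digit time, carrying (or borrowing) forward.

    carry_allowed[i] False models a negative 'carry suppression' lead
    at digit time i, so no carry passes to the next digit.
    """
    if carry_allowed is None:
        carry_allowed = [True] * len(x_bits)
    carry = 0
    out = []
    for x, y, allowed in zip(x_bits, y_bits, carry_allowed):
        out.append(x ^ y ^ carry)
        if subtract:
            next_carry = ((1 - x) & y) | ((1 - (x ^ y)) & carry)  # borrow
        else:
            next_carry = (x & y) | ((x ^ y) & carry)              # carry
        carry = next_carry if allowed else 0
    return out


WIDTH = 8
total = from_bits(serial_add_sub(to_bits(5, WIDTH), to_bits(3, WIDTH)))
diff = from_bits(serial_add_sub(to_bits(5, WIDTH), to_bits(7, WIDTH), subtract=True))
no_carry = from_bits(serial_add_sub(to_bits(5, WIDTH), to_bits(3, WIDTH),
                                    carry_allowed=[False] * WIDTH))
print(total, diff, no_carry)
```

With carries suppressed throughout, the adder degenerates into a digit-by-digit 'not equivalent' of the two inputs, which is one reason a suppression lead is useful.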

Arrangement  of  circuits  based  on  packages 

It was required to base the logical circuits on a standard size of
package which could also be used for other circuits, e.g. a nickel-line
1-word store [Fairclough, 1956]. A unit which could accommodate
three valves and had a 32-way plug was decided on; the


Chapter  9  j  The  design  philosophy  of  Pegasus,  a  quantity-production  computer  175 


Table 1 Digits at various internal points of the adder/subtracter unit
when set to add, for all combinations of the input and carry digits


Present 

Digits  at  internal  points 

Inputs  digiK 

carry 

digit 

A 

g 

c 

E 

F 

(Si/m) 

{Next 

A 

r 

7. 

ramj) 

0 

0 

0 

0 

0 

1 

0 

0 

0 

0 

1 

1 

0 

1 

1 

0 

0 

1 

0 

1 

0 

1 

1 

0 

0 

1 

1 

0 

1 

0 

1 

0 

1 

0 

0 

1 

0 

1 

0 

1 

1 

0 

1 

0 

0 

1 

1 

1 

1 

1 

1 

0 

0 

0 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

0 

1 

1 

Note  — A  and  C  are  at  the  grids  of  the  digit  delay  units. 

problem then was to arrange the various circuits in such a way
as to enable a computer to be designed using a minimum total
number of packages without too many types. Five types were
arrived at and these are shown in Fig. 6.

As an example of the factors involved, consider package types
1 and 2. The circuit units based on package type 1 can perform
all the functions of those on type 2. However, there are many uses
for a digit-delay circuit with a single 'and' gate input (package
type 2), and since three units of this kind (instead of two for a
2-'and'-gate input delay) can be based on one package, a saving
can be effected. In Pegasus this saving amounts to 32 packages,
which is considered to be well worth an extra package type.

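In addition to the five logical packages, a further 16 types (three
of which are peculiar to each computer) are required. The numbers
used for the various functions are given below:

                                                Number
Logical types:
    Type 1 ................................      113
    Type 2 ................................       64
    Type 3 ................................       55
    Type 4 ................................       45
    Type 8 ................................       37
Nickel line 1-word store ..................       61
Drum-store packages (8 types) .............       38
Input/output packages (3 types) ...........       17
Clock and reset waveforms (3 types) .......       14
                                                 ----
Total                                            444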


[Figure 6: circuit units contained in each of the logical package types. Note: clock connections are not shown; they are implied whenever a delay symbol is used.]


Fig.  6.  Contents  of  logical  packages.  The  arrowhead  on  an  output  lead  denotes  the  presence  of  an  OR  crystal  connection. 


176  Part  2  I  The  instruction-set  processor:  main  line  computers  Section  2     Processors  with  a  general  register  state 


The magnetic-drum store and the circuit packages used with
it are described in another paper [Merry and Maudsley, 1956],
as is the nickel-line store [Fairclough, 1956].

The  mechanical  design  of  the  packages 

General  form 

Each  standard  package  consists  of  three  main  parts,  namely  the 
valve  panel,  the  component  panel  and  the  plug. 

The  valve  panel  is  an  aluminium  pressing,  there  being  three 
types — a  3-valve  type,  a  2-valve  type  and  a  blank.  The  package 
type  number  is  marked  on  the  panel  by  two  dots  according  to 
the  standard  resistor  colour  code. 

The component panel houses up to 100 components, including
small  transformers,  chokes  and  coils,  the  panel  and  the  handle 
being  made  in  one  piece  from  sheet  insulating  material.  This 
design  provides  a  minimum  resistance  to  airflow  over  the  valves 
and  gives  ample  protection  to  the  valves  against  accidental  dam- 
age. 

The plugs and sockets are used in multiples of eight connections.
Most of the packages have four plugs providing 32 connections,
but up to 64 are possible in each package. The plug contacts
are made of brass and are heavily silver-plated. The socket uses
a proprietary valve-holder contact, which can readily be replaced
if damaged.




Fig.  7.  Standard  package. 


This  combination  of  plug  and  socket  has  a  consistently  low 
contact  resistance  (0.003  ohm  at  1  amp);  the  insertion  and  with- 
drawal force  is  about  4  oz  per  contact. 

The  wiring  of  the  packages 

At  present  packages  are  wired  and  soldered  by  hand.  The  wiring 
is  point-to-point,  and  within  the  limitations  of  layout  for  efficient 
performance,  wire  lengths  are  standardized  for  mass  production  on 
automatic wire-cutting and stripping machines. The symmetry of
the eyelet positions makes it possible to use components which
are  preformed  to  a  standard  pitch  and  would  allow  for  automatic 
preforming  and  insertion  of  components. 

Experimental packages have been produced by photo-etched
wiring  and  dip  soldering. 

Specification  of  the  computer  Pegasus 
Summary  specification 

A  detailed  specification  would  cover  the  ground  of  the  program- 
ming manual  [Pegasus  Programming  Manual,  Ferranti  Ltd., 
London]  and  would  be  out  of  place  here. 

Pegasus  is  a  binary  serial-digital  computer.  The  word  length 
is  42  binary  digits,  of  which  39  digits  are  used  for  a  number  and 
its  sign  (negative  numbers  are  represented  by  their  complements 
with  respect  to  two),  one  digit  is  used  for  a  parity  check  and  the 
other  two  are  gap  digits.  The  length  of  an  order  is  19  binary  digits, 
so  that  one  word  may  consist  of  two  orders,  the  remaining  digit 
being  a  'stop-go'  digit.  If  the  'stop-go'  digit  is  a  '0',  the  computer 
will  stop  before  obeying  the  orders  in  the  word,  but  will  proceed 
unhindered if the digit is a '1'.

There  is  a  2-level  store,  a  magnetic  drum  holding  5120  words 
and  an  immediate-access  or  computing  store  of  55  single-word 
magnetostriction  delay  lines. 

An order is made up of seven N-digits, three X-digits, six F-digits
and three M-digits, the N-digits being the most significant and the
M-digits the least significant. The N-digits allow 128 addresses in
the immediate-access store (of which only 63 are used). The registers
in this store are shown in Fig. 8. The X-digits refer to one
of the accumulators, the registers corresponding to N-addresses
0-7. Thus the order code is a 2-address code with one address
referring to only a limited part of the store. The F-digits indicate
the function of the order. A list of functions and their corresponding
F values is given in the appendix of this chapter. The M-digits
indicate a modifier for the order: they select one of the accumulators,
and the modification process is to add certain parts of the
contents of the selected accumulator to the order before it is




[Figure 8: map of the computing-store addresses in programmers' notation. Labels on the figure: accumulators (or X registers), these being the registers used for modification; always zero; single-word transfer; block transfers to and from main store; double length; special registers, comprising hand switches (20 digits), input/output checked (5 digits), output unchecked (5 digits) and registers that always hold fixed constants.]

Fig. 8. Allocation of addresses in store.

obeyed,  the  part  chosen  depending  on  the  function  of  the  order 
to  be  modified.  Figure  9  gives  a  schematic  representation  of  the 
modification  process.  The  effect  of  modifying  an  order  depends 
on  the  function  of  the  order  and  can  be  to  make  the  effective  order 
length 22 digits. This extension is necessary when specifying an
address  in  the  main  store. 
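
The division of a 19-digit order into its four fields can be illustrated with a short packing and unpacking sketch. The field widths and the fact that N is most significant and M least significant come from the text; placing X and F between them in the order listed, the function names, and writing the F value as a two-digit octal number are assumptions made for the example.

```python
# Field widths from the text: N (7), X (3), F (6), M (3) = 19 digits.
# ASSUMPTION: the fields are laid out N, X, F, M from most to least significant.
FIELDS = (("n", 7), ("x", 3), ("f", 6), ("m", 3))
ORDER_DIGITS = sum(width for _, width in FIELDS)


def encode_order(n, x, f, m):
    """Pack the four fields into one 19-digit integer."""
    values = {"n": n, "x": x, "f": f, "m": m}
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} digits")
        word = (word << width) | value
    return word


def decode_order(word):
    """Split a 19-digit integer back into its fields."""
    if not 0 <= word < (1 << ORDER_DIGITS):
        raise ValueError("an order is 19 digits long")
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


# Function 11 applied to register 5 and accumulator 3, unmodified
# (treating the function number as octal is an assumption).
order = encode_order(n=5, x=3, f=0o11, m=0)
print(order, decode_order(order))
```

Seven N-digits give the 128 addresses mentioned in the text, and three X-digits or M-digits select one of eight accumulators.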

Transfers  of  information  can  take  place  between  the  computing 
store  and  the  main  store,  and  vice  versa,  either  in  single  words 
or  in  blocks  of  eight  words.  For  single-word  transfers,  only  the 
register  with  address  1  in  the  computing  store  is  involved.  For 
block transfers the address on the drum of the first word of the
block  must  be  divisible  by  eight,  and  the  registers  in  the  computing 
store  that  are  involved  will  be  one  of  the  discrete  blocks  indicated 
in  Fig.  8. 
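
The alignment rule for block transfers amounts to a simple check on the drum address. The sketch below uses the drum size and block length given in the text; numbering the drum words from 0 and the function names are assumptions made for the example.

```python
DRUM_WORDS = 5120   # main-store capacity given in the text
BLOCK_WORDS = 8     # words per block transfer


def is_block_start(address):
    """True if a block transfer may begin at this drum address."""
    return 0 <= address < DRUM_WORDS and address % BLOCK_WORDS == 0


def block_addresses(first_word):
    """Drum addresses covered by the block starting at first_word."""
    if not is_block_start(first_word):
        raise ValueError("block must start at a drum address divisible by eight")
    return list(range(first_word, first_word + BLOCK_WORDS))


print(DRUM_WORDS // BLOCK_WORDS, block_addresses(16))
```

On these figures the drum holds 640 whole blocks.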

Input and output is by means of punched paper tape. An 'external
conditioning' order is included in the code to enable a choice
of input and output equipment to be made. In the standard
machine, two tape readers are used.


All stored information is checked (when read) by means of a
parity digit, which is such that the total number of 1's in any
correctly stored word is odd. The input and output of decimal
characters on tape can be checked by a similar process.
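
The odd-parity rule can be made concrete with a small sketch. The 39 information digits per word come from the specification above; holding them in a Python integer and the function names are assumptions made for the example.

```python
DATA_DIGITS = 39  # number-and-sign digits per word, from the specification


def parity_digit(data):
    """Digit that makes the total count of 1's in the stored word odd."""
    ones = bin(data).count("1")
    return 0 if ones % 2 == 1 else 1


def is_correctly_stored(data, parity):
    """Check made on reading: the total number of 1's must be odd."""
    return (bin(data).count("1") + parity) % 2 == 1


word = 0b1011
p = parity_digit(word)
print(p, is_correctly_stored(word, p), is_correctly_stored(word ^ 0b100, p))
```

Any single changed digit alters the count of 1's by one and is therefore detected. A word read back as all 0's, parity digit included, also fails the check, since it contains an even number of 1's.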

The  considerations  which  led  to  the 
specification  and  the  logical  design 

The  main  features  of  the  design  are 

a    The  use  of  a  computing  store  from  which  all  orders  and 
numbers  are  taken  while  computing 

b    The provision of multiple accumulators

c    The provision of special orders and facilities for dealing
easily with "red tape"

The  computing  store.  The  use  of  a  fast-access  store  from  which 
all  numbers  and  orders  are  taken  increases  the  speed  of  the 
machine  and  eliminates  the  need  for  optimum  programming.  It 
is this computing store which makes it possible to use an inexpensive
magnetic drum (with a relatively long access time) as the main
store,  and  yet  have  a  machine  which  is  fast  and  relatively  simple 
to  programme.  On  the  other  hand,  programmes  have  more  "red 
tape"  and  are  not  as  simple  as  with  single-level  storage. 

Transfer  between  levels  is  in  blocks  of  eight  words;  this  is  a 
simplification  and  saves  time.  One  block  holds  a  reasonable  amount 
of  programme  and  other  blocks  hold  data.  Four  blocks  in  all  (32 
words)  would  be  just  sufficient,  and  Pegasus  was  originally  de- 
signed with  this  number.  The  design  was  subsequently  modified 
to  six  blocks,  which  is  quite  adequate,  in  conjunction  with  the 
seven  accumulators.  Any  further  increase  in  the  size  of  the  com- 
puting store  would  be  achieved  by  increasing  the  size,  not  the 
number,  of  blocks.  As  it  is  there  is  an  economic  balance  between 
the  usefulness  and  the  cost  of  the  computing  store. 

"Red tape" is an expression for the non-arithmetic orders in a programme.


[Figure 9: the order being modified, with its fields N, X, F and M, shown against the accumulator digits added to it for functions 70, 71, 74, 75 and for functions 72, 73, 76, 77. The shaded portion is added to the order. The modifier digits always occupy fixed positions in the X registers.]

Fig.  9.  Order-modification  process. 




The provision of several accumulators. This is the most novel
feature of the logical design of Pegasus. It is generally agreed that
the simplest order code from the user's aspect is the 3-address code
with orders of the form A + B → C. An examination of this
form of code, however, shows that in many cases two of the addresses
are the same, so that the order takes the 2-address form
A + B → A. A further examination shows that in a large proportion
of cases the address A is confined to a very few addresses.
This leads to the suggestion of a code of the form N + X → X,
where X covers only a small part of the store while N covers the
whole store. This will have the advantage of yielding a reasonably
short order. In Pegasus two such orders are incorporated in one
word, leaving sufficient digits to specify a modification register (a
Mancunian B-line) in each order.

The  extreme  case  of  this  code  is,  of  course,  the  single-address 
code,  where  X  is  confined  to  one  address,  the  accumulator.  How- 
ever, experience  had  convinced  the  programmers  collaborating  in 
the design of Pegasus that, with single-address codes, a large
number  of  orders  are  concerned  solely  with  transfers  of  numbers 
from  one  register  to  another;  the  single  accumulator  is  a  restriction 
through  which  all  numbers  must  pass  and  in  which  all  operations 
have  to  be  performed. 

In  the  Manchester  University  computer  the  B-lines  serve  two 
very  valuable  but  distinct  purposes:  they  allow  order  modification 
and  rudimentary  arithmetic  (such  as  counting)  to  be  done  without 
disturbing  the  accumulator.  It  was  felt  that  fuller  arithmetic  and 
logical  facilities  on  these  B-lines  would  have  been  extremely  valu- 
able. The  seven  accumulators  in  Pegasus,  used  for  modification 
and  arithmetic,  are  a  development  of  the  B-line  concept. 

Special facilities for dealing with 'red tape'. The difficulties associated
with the 2-level storage system have been greatly reduced
by  having  an  order-modification  procedure  which  depends  on  the 
function  of  the  order  (Fig.  9).  This  method  of  modifying  orders, 
used  in  conjunction  with  order  66  of  the  code  (the  unit-modify 
order),  enables  the  counting  through  blocks  of  information  to  be 
done  with  relative  ease. 

The  use  of  the  group-4  orders  of  the  code  enables  counters  to 
be  set  conveniently  and  a  constant  (up  to  127)  to  be  placed  in 
an accumulator, the constant being the value of the N-digits of
the  order.  Order  67  (the  unit-count  order)  enables  the  counting 
of  cycles  of  operations  to  be  dealt  with  in  a  simple  way.  A  jump 
to  another  part  of  the  programme  can  be  programmed  to  take 
place  automatically  when  the  required  number  of  cycles  has  been 
performed. 


Having  a  large  number  of  jump  instructions  greatly  helps  in 
organizing  a  programme.  In  particular,  one  order  enables  a  jump 
to  be  made  depending  on  the  condition  of  an  accumulator  (being 
zero,  for  example),  and  another  order  on  the  complementary  con- 
dition (being  not  zero).  When  only  one  of  these  orders  is  available 
it  is  necessary  to  think  ahead  to  see  whether  or  not  the  correct 
condition  will  be  satisfied.  Although  the  eight  jump  instructions 
included in the code were felt initially to be enough, it is now
suggested  by  programmers  that  even  more  such  orders  would  be 
helpful. 

The  logical  shift  orders,  52  and  53,  are  also  included  to  simplify 
'red  tape'.  In  particular,  they  are  used  for  packing  and  unpacking 
words  holding  several  items  of  information. 

As a result of including these various orders, the order code
of Pegasus is quite large. It is worth remarking, however, that
a sensible grouping of the orders makes the code very easy to
remember. A sensible arrangement of the code also tends to
reduce the amount of equipment needed to engineer
it. For example, when the equipment for dealing with group 0
of the code has been allocated, groups 1 and 4 require the addition
of only three gates.

Facilities  for  checking  programmes.  The  features  mentioned  above 
make  the  computer  easier  to  programme,  and  there  are  other 
facilities  in  Pegasus  that  make  it  easier  to  check  out  and  develop 
new  programmes.  These  include  causing  the  machine  to  stop 
obeying  orders,  either  under  programme  control  or  when  the 
programme  is  in  error.  In  particular,  the  machine  stops  if  an  order 
for  writing  in  the  main  store  is  reached  and  an  overflow  indicator 
is  set.  A  further  aid  when  testing  new  programmes  is  the  automatic 
punching  out  of  all  main-store  addresses  appearing  in  block- 
transfer  orders.  When  this  information  is  examined  an  indication 
of  the  course  of  a  programme  is  readily  obtained.  The  punching 
can  be  inhibited  by  a  switch  when  a  return  to  full-speed  running 
is  needed. 

Machine  rhythm 

The  logical  design  of  Pegasus  is  built  around  a  nucleus  that  deals 
with  the  simple  arithmetic  orders,  groups  0,  1  and  4,  of  the  code. 
This  nucleus  contains  the  control  section,  i.e.  the  order  register 
and  order  decoding  equipment,  and  the  mill  in  which  these  orders 
are executed. The design of this nucleus could not begin until a
basic rhythm had been determined for extracting a pair of such
orders from the computing store and executing them. When the
outline  of  this  nucleus  was  clear,  the  equipment  for  dealing  with 
the  remaining  orders  in  the  code  was  designed  to  fit  it. 




The following arguments led to the basic rhythm. Since the
orders of groups 0, 1 and 4 are similar in many respects, for
definiteness, it will be sufficient to consider a particular order, 11
of the code, say. This is an order which takes two numbers from
the  computing  store  and  replaces  one  of  them  by  their  sum.  It 
would  take  a  prohibitive  amount  of  equipment  to  extract  these 
numbers,  add  them  together  and  have  the  least  significant  digit 
of  the  sum  available  for  replacing  in  the  store  in  the  same  digit 
time  as  the  least  significant  digits  of  the  two  components  taken 
out  of  the  store.  In  practice,  some  four  digit  times  at  least  would 
be  needed  for  this  sequence  of  operations.  Thus,  it  would  be  im- 
possible to  return  the  sum  to  the  store  in  the  same  word  as  the 
operands  are  extracted  without  having  an  entry  point  to  each 
register  which  is  in  a  different  timing  from  the  normal  circulation 
entry.  To  produce  two  such  entry  points  to  each  register  would 
mean  more  equipment  associated  with  each  register,  which  was 
considered  an  uneconomical  use  of  extra  equipment.  Instead,  it 
was decided to delay the sum so that it could enter the register
in  the  computing  store  in  the  next  word  time  in  standard  timing. 
This  involves  one  common  delaying  circuit  instead  of  one  for  every 
register.  Such  an  order  therefore  takes  two  word  times  to  execute. 
It  may  be  argued  that  this  second  word  time  could  be  made  to 
overlap  with  the  first  word  time  for  the  next  order.  Two  reasons 
oppose  this:  the  new  contents  of  the  register  being  changed  might 
be  required  by  the  next  order;  and  two  different  sets  of  equipment 
for  selecting  a  storage  register  would  be  needed  if  numbers  were 
to  be  extracted  from  one  and  replaced  in  another  register  in  the 
same  word  time. 

Thus,  the  execution  of  a  pair  of  orders  taken  from  the  comput- 
ing store  requires  four  word  times.  The  reasons  for  opposing  the 
overlapping  of  the  execution  of  two  orders  also  oppose  the  extrac- 
tion of  an  order  pair  while  the  previous  pair  is  being  dealt  with. 
Five word times are therefore needed for the process of extracting
and obeying a pair of simple arithmetic orders. More time may
be  needed  for  some  of  the  other  orders  in  the  code. 

The basic 3-beat rhythm is thus established:

a    Extract the order pair from the computing store.
b    Obey the first order of the pair.
c    Obey the second order.

The duration of beat (a) is one word time; beats (b) and (c)
are each two word times long for orders in groups 0, 1, 4 and 6
of the code, but may be longer for other orders.
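
The arithmetic of the 3-beat rhythm can be sketched as below. The word length of 42 digits and the beat lengths come from the text; the digit period of 3 microseconds is an assumption that is not stated in this section, chosen only to show how a word-time count converts into a time per order, and the function names are likewise illustrative.

```python
WORD_DIGITS = 42        # digits per word, from the specification
DIGIT_PERIOD_US = 3.0   # ASSUMPTION: digit period in microseconds, not given here


def pair_word_times(first=2, second=2):
    """Word times for beat (a) plus beats (b) and (c) of one order pair."""
    return 1 + first + second


def mean_order_time_us(first=2, second=2):
    """Average time per order, sharing the extraction beat between the pair."""
    word_time_us = WORD_DIGITS * DIGIT_PERIOD_US
    return pair_word_times(first, second) * word_time_us / 2


print(pair_word_times(), mean_order_time_us())
```

Under the assumed digit period a pair of simple orders occupies five word times, or about 0.3 millisec per order once the extraction beat is shared between the two.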


Times  for  typical  operations 

The times for the various arithmetic operations are:

                                  millisec
Addition and subtraction             0.3
Multiplication                       2.0
Division                             5.4

These times include an allowance for the time to extract the
orders.

Some times for standard subroutines are:

                                  millisec
Exponential function                 29
Sine function                        24
Logarithmic function                 34

Finally, to give some indication of the time for a typical problem,
a set of 50 simultaneous equations (with a single right-hand
side) takes about 10¾ min. Of this time, 3 min 8 sec is for input,
7 min 17 sec is for calculation and 18 sec is for output.

Realizing  the  specification 

The  detailed  logical  design 

It would take too long to describe fully the detailed logical design.
One aspect is worth mentioning, however, namely the avoidance
of all 'exceptions' in the results of orders. As an example of an
exception consider the overflow indicators, which should be set
whenever the final result of an order is outside the permissible
range of numbers. In multiplication this can occur only when both
the multiplier and the multiplicand are −1, and this is likely to
occur very infrequently. Rather than provide equipment to sense
this infrequent case, it is easier to put a footnote in the programming
manual, where the overflow indicator is described, pointing
out the exception. It was felt, however, that such exceptions should
be avoided even at the expense of extra equipment or extra complication.
For this and other reasons concerned with facilitating
machine use, the logic of Pegasus is quite complicated.
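
The claim that multiplication can overflow only when both operands are −1 can be checked exhaustively for a short word. The sketch assumes numbers are two's-complement fractions in the range −1 ≤ v < 1, as the specification's complement representation suggests; the 5-digit word length and the function names are chosen only to keep the search small.

```python
from fractions import Fraction


def representable(digits):
    """All two's-complement fractions with the given number of digits."""
    scale = 1 << (digits - 1)
    return [Fraction(k, scale) for k in range(-scale, scale)]


def multiplication_overflows(digits):
    """Operand pairs whose exact product falls outside -1 <= v < 1."""
    values = representable(digits)
    return [(a, b) for a in values for b in values if not (-1 <= a * b < 1)]


print(multiplication_overflows(5))
```

The only offending pair is (−1, −1), whose product +1 lies just outside the range; the same argument holds for any word length because every other product has magnitude less than 1.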

The  end-product  of  the  detailed  logical  design  is  a  series  of 
diagrams  with  symbols  corresponding  to  the  circuit  units  of  the 
packages,  as  shown,  for  example,  in  Fig.  5.  The  inputs  and  outputs 
of  the  units  on  these  diagrams  correspond  to  the  pins  of  the  sockets 
into  which  the  packages  plug.  Thus,  the  wiring  lists  of  connections 
of  these  pins  can  be  produced  from  these  logical  diagrams.  The 
first  step  in  the  production  of  these  lists  is  to  allocate  a  position 




in  the  cabinets  to  each  logical  circuit  in  such  a  way  as  to  reduce 
the  amount  of  wire  needed.  When  the  layout  has  been  completed, 
the  last  stage  of  producing  the  wire  lists  can  proceed. 

General  construction  of  machine 

The  main  units  are  shown  in  Fig.  10. 

The  package  frame.  This  unit  is  a  simple  light-alloy  frame  sup- 
porting diecast  light-alloy  frame  racks  to  which  the  back  socket 
panels  are  fixed.  The  packages  slide  into  grooves  in  the  rack  and 
plug  into  sockets  at  the  back,  a  polarizing  feature  preventing  the 
insertion  of  a  package  upside  down.  If  electrical  or  magnetic 


screening  is  necessary  between  any  packages,  a  special  metal  plate 
is  inserted  in  slots  in  the  cast  rack  and  is  fixed  by  a  single  screw 
in  the  back  panel.  Coded  aluminium  strips  containing  coloured 
plastic  studs  which  identify  the  position  of  each  package  are  fixed 
to  the  front  of  each  casting. 

Arrangement of the packages. There are 200 packages per cabinet,
arranged in ten horizontal rows of 20 units per row. The metal
valve panels are placed so that the edges almost touch. The component
panel of each unit is in register with the unit in the corresponding
position in each of the other rows, thereby providing
vertical chimneys for cooling the components secured to these

Fig.  10.  Main  units. 




panels.  Warm  air  from  the  main  source  of  heat,  the  valves,  is 
prevented  by  the  valve  panels  from  reaching  the  more  tempera- 
ture-sensitive components,  such  as  diodes,  secured  to  the  com- 
ponent panel. 

The  back  panel  wiring.  For  locating  long  signal  wires  between 
sockets  a  system  of  plastic  strips  is  used,  which  hold  the  wires 
at  definite  positions  given  by  the  instructions  on  the  wiring  lists. 
The  exact  route  of  every  wire  is  predetermined,  thus  making 
wiring  and  inspection  more  reliable  and  fault  finding  and  mainte- 
nance easier. 

Final assembly. The completely wired frame is assembled in its
cabinet, which has already been fitted with the control and auxiliary
supply circuit unit, heater transformers, fuses, cooling assembly
and cableforms. The work of connecting the cableforms, heaters
and earths can be done by relatively unskilled labour working to
clearly written instructions and diagrams.

The cooling system. Each cabinet has its own cooling system as
an integral part of the construction; there is therefore no difficulty
in cooling cabinets added to existing computers. Two axial-flow
turbo blowers are mounted in the base beneath an airtight pressure
chamber, each providing 300 ft³/min of air at a total pressure head
of 1 in (water gauge). The maximum temperature rise is 10° C.

The power supply. A separate cubicle houses metal rectifiers, shunt
stabilizing  valves  and  control  circuits.  The  power  is  obtained  from 
the  mains  through  a  motor-alternator  set,  the  output  of  which  is 
stabilized  to  2%,  the  main  purpose  of  this  set  being  to  act  as  a 
buffer  against  switching  surges  and  other  mains  voltage  variations. 
The  valve  heaters  in  the  computer  are  energized  from  the  stabi- 
lized alternator  output,  which  is  expected  to  extend  the  valve  life. 

Maintenance 

General 

All digital computers so far have a fault rate which cannot be
ignored. When the best has been done in the choice of components,
circuits and mechanical construction, attention must be paid to
the following points to get the best out of a machine:

a    Rapid fault location

b    Getting the machine working again as soon as possible after
locating a fault

c    Preventive maintenance


Fault  location 

There  are  parity-checking  circuits  on  both  the  main  and  the  high- 
speed stores.  Errors  of  a  single  digit  in  the  stores  stop  the  machine. 
The fault can then be quickly located by examination of the
monitors. 

For other faults the general method is to run a test programme
(assuming  the  fault  is  not  in  the  main  control)  which  will  indicate 
the  area  of  the  fault.  Detailed  examination  can  then  be  carried 
out  with  the  monitors. 

All outputs of circuit units are readily accessible at monitoring
sockets on the front of each package, and in addition about 80
points can be directly selected by switches from the monitoring
position: these include all store lines and a number of key waveforms.
Fault-finding is normally a matter of tracing 0's and 1's
through the machine with reference to logical diagrams rather
than electronic circuit diagrams.

A variety of triggers can be selected for the monitor time-bases,
these including

a    Trigger at any word position within a drum revolution (128
different times selectable by switches)

b    Trigger at any word time of any selected order

These triggers and some other monitoring facilities are produced
by 19 standard packages and are found to be well worth
the extra equipment.

Fault  repair 

Once  a  faulty  package  has  been  located,  the  machine  can  be  got 
working  again  immediately  by  replacement  of  the  package  with 
a  spare;  repair  of  the  faulty  package  can  be  done  at  leisure  with 
the  aid  of  a  package  tester.  With  this  equipment  a  package  can 
quickly be given a series of standard tests; each is selected by
switches, and the performance is measured either by observation
of  meters  or  a  built-in  oscillograph. 

During  commissioning  not  one  case  was  found  of  the  first 
machine  doing  other  than  what  one  would  expect  from  the  logical 
diagram (except for a very few cases of incorrect wiring).

Preventive  maintenance 

The machine h.t. supplies are reduced while the test programmes
are being run. This marginal testing shows up incipient faults such
as deterioration in valves, crystal diodes or resistors. The machine
is at present kept in good running order down to 10% margins




(the supplies are normally controlled to about 1% of nominal),
although correct running at about 20% reduction has been observed.

Conclusions 

The  first  machine  has  been  computing  regularly  for  only  a  few 
months  and  has  been  on  regular  preventive  maintenance  (about 
1 hour per day) for a few weeks. Error-free runs of over 30 hours
are  common,  and  at  the  time  of  writing  there  has  been  no  error 


for  55%  hours'  running.  The  majority  of  package  replacements  are 
done  during  routine  maintenance. 

The  packaged  method  of  construction  of  computers  has  proved 
to  have  great  advantages  in  design,  construction  and  operation. 

References 

ElliW56a; ElboR53; ElliW51, 52, 53, 56b; FairJ56; JohnD52; MerrI56;
Pegasus Programming Manual, Ferranti Ltd., London; Pegasus Maintenance
Manuals, Ferranti Ltd., London.


APPENDIX 


The  Pegasus  Order  Code 


00  x' = n
01  x' = x + n
02  x' = −n
03  x' = x − n
04  x' = n − x
05  x' = x & n
06  x' = x ≢ n


07  Not  allocated 

10  n' = x
11  n' = n + x
12  n' = −x
13  n' = n − x
14  n' = x − n
15  n' = n & x
16  n' = n ≢ x

17  Not  allocated 

20  {pqY  =  n  ■  X 

21  {pqY  =  n  ■  X  +  2-3« 

22  (pqi)'  =  p  -I-  2-3»q  -f  nx 


23  {nqY  =  n  +  2-3«,/ 


this  order  assumes  that  any 
overflow  is  due  to  opera- 
tions in  7.  Clears  overflow 
unless  n'  overflows 


241 
25 


0  <  p'/n  <  1  (unrounded 
division) 

— ^  p'/n  <C  ^2  (rounded 
division) 


26  (/'  -I-  2-3* 

27  Not  allocated 


—     <  p'/n  <      (rounded  single- 
length  division 


Not  allocated 


40  x' = c
41  x' = x + c
42  x' = −c
43  x' = x − c
44  x' = c − x
45  x' = x & c
46  x' = x ≢ c
47  Not allocated
50  x' = 2^N x                       single-length arithmetical shifts
51  x' = 2^−N x (rounded)
52  Shift x up N places              single-length logical shifts
53  Shift x down N places
54  (pq)' = 2^N (pq)                 double-length arithmetical shifts
55  (pq)' = 2^−N (pq) (unrounded)

Note: x' = x if N = 0
Note: p' = p and q' = q if N = 0




56  (Normalize)  (pq)'  =  2>'{pq); 


=  X-  2-38y 


57  Not  allocated 


either  (1)  %  <  (pq)'  <  '/j  and 
-I  <  H  <  N  -  I 
or  (2)  -y2<(p(/)'<%and 

-1  <  li<  N  -  I 
or  (3)  -%  <  (p(/)'  <  14  and 
li  =  N  -  1 


60  Jump to N if x = 0
61  Jump to N if x ≠ 0
62  Jump to N if x ≥ 0
63  Jump to N if x < 0
64  Jump to N if overflow staticizor clear; clear overflow staticizor.
65  Jump to N if overflow staticizor set; clear overflow staticizor.
66  (Unit-modify) x_m' = x_m + 1. Jump to N if x_m' ≢ 0 (mod 8)
67  (Unit-count) x_c' = x_c − 1. Jump to N if x_c' ≠ 0

70  Single word read to accumulator 1. 1' = s
71  Single word write from accumulator 1. s' = 1
72  Block read from main store. u' = b
73  Block write into main store
74  External conditioning
75  Not allocated
76  Not allocated
77  Stop

The notation used here is as follows:

N is the first address (the register address) in an order.
X is the accumulator specified in an order.
n is the word in N before obeying the order.
x is the word in X before obeying the order.
p and q are the words in 6 and 7 before obeying the order.
(pq) = p + 2^−38 q, with q ≥ 0. This is a double-length number.
x', n', p' and q' are the corresponding values after obeying the order.
B is a block in the main store (the drum).
U is a block in the computing store.
P is the position number of a word within a block.
OVR is the overflow indicator.
x_m is the modifier in X, i.e. an integer represented by the digits
1 to 13 of x.
x_c is the counter in X, i.e. an integer represented by the digits
14 to 38 of x.
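
The modifier and counter fields, and the unit-modify (66) and unit-count (67) orders that act on them, can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes a 39-digit word whose digit 0 is the most significant (sign) digit, and it assumes that both fields wrap around within their own width; neither assumption is stated in the order code above, and the function names are invented for the example.

```python
# Assumed layout: 39-digit word, digit 0 (sign) most significant, digit 38 least.
MOD_BITS = 13                      # modifier x_m occupies digits 1 to 13
CNT_BITS = 25                      # counter  x_c occupies digits 14 to 38
MOD_MASK = (1 << MOD_BITS) - 1
CNT_MASK = (1 << CNT_BITS) - 1


def modifier(x):
    """Return x_m, the integer held in digits 1 to 13."""
    return (x >> CNT_BITS) & MOD_MASK


def counter(x):
    """Return x_c, the integer held in digits 14 to 38."""
    return x & CNT_MASK


def unit_modify(x):
    """Order 66: x_m' = x_m + 1; the jump is taken if x_m' is not 0 (mod 8)."""
    m = (modifier(x) + 1) & MOD_MASK          # wrap-around is an assumption
    new_x = (x & ~(MOD_MASK << CNT_BITS)) | (m << CNT_BITS)
    return new_x, m % 8 != 0


def unit_count(x):
    """Order 67: x_c' = x_c - 1; the jump is taken if x_c' is not 0."""
    c = (counter(x) - 1) & CNT_MASK           # wrap-around is an assumption
    new_x = (x & ~CNT_MASK) | c
    return new_x, c != 0


if __name__ == "__main__":
    x = (6 << CNT_BITS) | 2                   # modifier 6, counter 2
    x, jump = unit_modify(x)
    print(modifier(x), counter(x), jump)
    x, jump = unit_count(x)
    print(modifier(x), counter(x), jump)
```

Returning the jump decision as a boolean, rather than changing a program counter, keeps the sketch independent of how the control unit is modelled.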


Chapter  10 

An  8-bit-character  computer 


Introduction 

We  present  in  this  chapter  the  result  of  an  exercise  to  design  an 
8-bit  computer.  Although  a  rather  trivial  machine,  it  is  not  without 
interest,  either  as  manipulator  of  variable-length  character  strings 
or  as  an  interpreter  of  more  complex  computers  in  a  role  similar 
to  a  microprogrammed  Pc.  In  the  latter  role  a  read-only  memory 
could  be  used  as  Mp  to  speed  up  the  Pc. 

This computer is typical of 8-bit character-oriented computers.
Among the similar machines are the Interdata Model 3, the RCA
1600, the IBM System/360 Model 25, and the Data Machines Inc.
DMI 520/1. A processor of this type rarely stands alone but is used
with a fixed program in the following ways: as a control in a larger
C, as a control to a laboratory or other complex instrument, and
as a microprogrammed processor to interpret an ISP.¹

The processor must perform fixed-length operations on both
8-bit characters and 16-bit addresses. The address (double length)
operations are necessary for performance reasons, because almost
all programs operate on address integers. (For example, see the
program on page 185.) Thus, extending (generalizing) the operation
length to three and four characters is comparatively inexpensive.
It should be noted that a processor might allow the operation
length to be specified between 1 and perhaps 2^8 (256) characters
for a much more general capability. We limit the directly addressable
Mp to 2^16 (or 65,536) characters. An alternative design might
allow the maximum addressable Mp to be 2^24 words, or, alternatively,
it could be variable. Although 24-bit operations are
defined, their implementation might be expensive. Aligning the
24-bit words on 32-bit-word boundaries would simplify the address
calculation hardware.

The  ISP 

The basic information unit is the 8-bit character. Instructions are,
in general, one character in length. However, both instructions
and data formats are of variable length, instructions being 1, 2,
3, 4, and 5 characters long, and data being 1, 2, 3, and 4 characters
long. The Pc state contains ~35 characters, which are organized
to be dealt with as eight 8-, 16-, 24-, or 32-bit registers (shown
in the ISP description in Appendix 1 of this chapter). Of these
registers, the first (register 0) is taken to be a special accumulator, A.

The Pc state contains both operands and addresses to operands.
The instructions to load or store register A, from or into Mp, with
or without incrementing a general register, all use the general
registers as a two-character address pointer. Any general register
may be loaded or stored direct from or to Mp. The binary arithmetic
and logical operations are with a register and the accumulator,
and leave the result in the accumulator; i.e., they are of the
form

A ← A b R[r]


¹The structure should be compared with the elaborate microprogrammed
IBM System/360 Model 30 (Chap. 32).

[Fig. 1. Instruction coding for an 8-bit-character computer. The figure
tabulates, for each op code (op = xxxyy₂), the mnemonic, the behavior,
and the instruction length in characters (enclosed in parentheses), and
shows the instruction formats, including the direct address and
immediate data (2-5 character) forms. For the behavior see the state
diagram, Fig. 2.]




[Fig. 2. An 8-bit-character-computer instruction-interpretation state
diagram. (a) No parameters; (b) integer or relative address; (c) direct
address; (d) immediate data. Instruction lengths shown: 2 characters,
3 characters, 2-5 characters. Legend: the operation specified by the
instruction q; operation to determine location of instruction q; access
to obtain instruction q; operation to determine variables specified by
instruction q; access to obtain variables or return result variables.]

The general registers discussed above are similar to those of
the general register processors. Since it is assumed that this type
of processor might be used to interpret another ISP, the +1 and
−1 instructions provide for both string and stack memory operations.
The instructions for a microprogrammed P and the I/O
devices are not defined. For example, a 16-way branch instruction
which branched to one of 16 locations based on 4 bits of the
accumulator might facilitate writing an interpreter.

The ISP is given in Appendix 1 of this chapter. The Pc state
is organized about a small scratch-pad memory, although Mp could
be used instead. The instruction formats and the operation code
assignments are shown in Fig. 1.

The instructions behave as illustrated in the state diagram (Fig.
2). For example, the instruction "lri 3, A907₁₆" is coded

00100,011    1010,1001  0000,0111

and the effect is

R[3]<0:15> ← A907₁₆

The instruction, xor 3, with L = 2, is coded

11010,011

and the effect is

R[0]<0:23> ← R[0]<0:23> ⊕ R[3]<0:23>

In these examples, the behavior of lri and xor is specified in the
state diagrams of Fig. 2d and 2a, respectively.
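
The two codings above follow from the instruction format in Appendix 1 (a 5-bit op code and a 3-bit register address in the first character, followed by any immediate or address characters). A minimal Python sketch, assuming bit 0 is the most significant bit of a character and that immediate data is stored most significant character first; the helper names `encode` and `show` are invented for this illustration:

```python
LRI = 0b00100   # load register immediate (op code from Appendix 1)
XOR = 0b11010   # exclusive or            (op code from Appendix 1)


def encode(op, r, tail=b""):
    """Pack op<0:4> and r<0:2> into the first character, then append the tail."""
    if not 0 <= op < 32:
        raise ValueError("op must fit in 5 bits")
    if not 0 <= r < 8:
        raise ValueError("r must fit in 3 bits")
    return bytes([(op << 3) | r]) + bytes(tail)


def show(code):
    """Format characters the way the text prints them: 5,3 bits then 4,4 bits."""
    first = f"{code[0]:08b}"
    parts = [first[:5] + "," + first[5:]]
    for ch in code[1:]:
        bits = f"{ch:08b}"
        parts.append(bits[:4] + "," + bits[4:])
    return " ".join(parts)


if __name__ == "__main__":
    # lri 3, A907 (hexadecimal) with a two-character register length
    print(show(encode(LRI, 3, (0xA907).to_bytes(2, "big"))))
    # xor 3 has no parameters, so it occupies a single character
    print(show(encode(XOR, 3)))
```

Under these assumptions the two calls reproduce the bit patterns quoted in the text.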

An open subprogram to perform the n-component vector
(16-bit) addition¹ A ← B + C is

start   si   2 − 1       set register length = 2
        lri  4, A        set up vector pointers to
        lri  5, B        locations A, B, C in Mp
        lri  6, C
        lri  7, 2 × n    set up count at 2n
loop    lal  5           fetch B
        st   3           store B temporarily
        lal  6           fetch C
        ad   3           add
        sal  4           store in A
        sul  7           decrement n count
        cnr  4, loop     branch if negative n

The above program loop is nine characters long. A program
loop for the IBM System/360 is about 16 characters long. The
setup is 13 characters, as opposed to 6 ~ 16 characters for the
360.
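
The effect of the subprogram on Mp can be sketched in Python. This is a behavioral sketch only, not a simulation of the Pc: it assumes each 16-bit element is stored most significant character first, that sums wrap modulo 2^16, and that the loop simply runs n times; the function name and its parameters are invented for the example.

```python
def vector_add(mem, a, b, c, n, length=2):
    """Store B[i] + C[i] into A[i] for i = 0..n-1, each element `length` characters."""
    mask = (1 << (8 * length)) - 1
    for _ in range(n):
        x = int.from_bytes(mem[b:b + length], "big")      # lal 5 ; st 3
        y = int.from_bytes(mem[c:c + length], "big")      # lal 6
        total = (x + y) & mask                            # ad 3 (carry discarded)
        mem[a:a + length] = total.to_bytes(length, "big") # sal 4
        a += length                                       # pointers advance by L'
        b += length
        c += length
    return mem


if __name__ == "__main__":
    mem = bytearray(64)
    mem[16:20] = bytes([0x00, 0x01, 0xFF, 0xFF])   # B = [1, 0xFFFF]
    mem[32:36] = bytes([0x00, 0x02, 0x00, 0x01])   # C = [2, 1]
    vector_add(mem, a=0, b=16, c=32, n=2)
    print(mem[0:4].hex())
```

The three pointers play the role of registers 4, 5 and 6 in the listing, each advancing by the register length after every access.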

Conclusions 

We have violated our principle of showing "real" computers by
designing this computer. We think it is typical of a small processor,
but slightly more interesting.


¹The length is specified by register L.


APPENDIX 1    AN 8-BIT-CHARACTER COMPUTER ISP DESCRIPTION

Pc State

The following array of 8 general registers is mapped into the first 8 × (L+1) Mp cells. The register length is
<0:(8 × (L+1)) − 1>. The first register of each array, R[0], is an accumulator, and has special properties.

R[0:7]<0:(8 × L') − 1>  := M[0:7][0:L]<0:7>          General Registers of length (L+1) × 8 bit
A<0:(8 × L') − 1>       := R[0]<0:(8 × L') − 1>      Accumulator (generally)
RQ[0:7]<0:31>           := M[0:7][0:3]<0:7>          Quadruple Registers
AQ<0:31>                := RQ[0]<0:31>               Quadruple Accumulator
RT[0:7]<0:23>           := M[0:7][0:2]<0:7>          Triple Registers
AT<0:23>                := RT[0]<0:23>               Triple Accumulator
RD[0:7]<0:15>           := M[0:7][0:1]<0:7>          Double Registers
AD<0:15>                := RD[0]<0:15>               Double Accumulator
RS[0:7]<0:7>            := M[0:7][0:0]<0:7>          Single Registers
AS<0:7>                 := RS[0]<0:7>                Single Accumulator

The following flags are the result of all arithmetic and logical instructions on the Accumulator, A. These are connected
to A to form A'.

A'<N,Z,C,0:(8 × L') − 1> := N□Z□C□A<0:(8 × L') − 1>

N                       Negative result flag
Z                       Zero flag, set if the register contains a zero
C                       Carry flag, set if there is a carry or borrow from bit 0 of the
                        addition
L<0:1>                  2 bit register to indicate the character length of operations;
                        1,2,3,4 for S,D,T,Q

P<0:15>                                  Program counter

Mp State

M[0:177777₈]<0:7>                        primary memory

Instruction Format

i[0:4]<0:7>                              2 to 5 character instruction
op<0:4>             := i[0]<0:4>         op code
r<0:2>              := i[0]<5:7>         register address
s<0:7>              := i[1]              signed integer for shifts
d<0:15>             := i[1:2]            address integer
im<0:(8 × L') − 1>  := i[1:L']<0:7>      variable length immediate data

Instruction Interpretation Process

(instruction[0:4]<0:7> ← M[P:P+4]; P ← P + 1; next          fetch
 ((op = Olli) ∨ (op = Un) ∨ (op = 1001)) → (P ← P + 2);
 ((op = IHO) ∨ (op = 1010)) → (P ← P + 1);
 (op = 0104) → (P ← P + L + 1); next
 Instruction_execution)



Instruction Set and Instruction Execution Process

Instruction_execution := (
  la    (:= op = 0)     → (A ← M[RD[r]]);                             load A
  lal   (:= op = 1)     → (A ← M[RD[r]]; next RD[r] ← RD[r] + L');    load A, increment
  sa    (:= op = 2)     → (M[RD[r]] ← A);                             store A
  sal   (:= op = 3)     → (M[RD[r]] ← A; next RD[r] ← RD[r] + L');    store A, increment
  lri   (:= op = 4)     → (R[r] ← im);                                load register immediate
  ari   (:= op = 5)     → (R[r] ← im + R[r]);                         add register immediate
  srd   (:= op = 6)     → (M[d] ← R[r]);                              store register
  lrd   (:= op = 7)     → (R[r] ← M[d]);                              load register
  adl   (:= op = 01000) → (R[r] ← R[r] + L');                         add L' to register
  sul   (:= op = 01001) → (R[r] ← R[r] − L');                         subtract L' from register
  br    (:= op = 01010) → (P ← R[r]);                                 branch return
  bld   (:= op = 01011) → (P ← d; R[r] ← P);                          branch and link direct
  cbr   (:= op = 01100) → ((cond ≠ 0) → P ← P + s);                   conditional branch relative
  cbd   (:= op = 01101) → ((cond ≠ 0) → P ← d);                       conditional branch direct
  cnr   (:= op = 01110) → ((cond = 0) → P ← P + s);                   conditional not branch relative
  cnd   (:= op = 01111) → ((cond = 0) → P ← d);                       conditional not branch direct
          cond := f(r, N, Z, C)
  ad    (:= op = 10000) → (A' ← A + R[r]);                            add
  adc   (:= op = 10001) → (A' ← A + R[r] + C);                        add with carry
  sb    (:= op = 10010) → (A' ← A − R[r]);                            subtract
  sbc   (:= op = 10011) → (A' ← A − R[r] − C);                        subtract with carry
  mui   (:= op = 10100) → (A' ← A × R[r] {i});                        integer multiply
  muf   (:= op = 10101) → (A' ← A × R[r] {fr});                       fraction multiply
  dii   (:= op = 10110) → (A' ← A / R[r] {i});                        integer divide
  dif   (:= op = 10111) → (A' ← A / R[r] {fr});                       fraction divide
  and   (:= op = 11000) → (A ← A ∧ R[r]);                             logical and
  or    (:= op = 11001) → (A ← A ∨ R[r]);                             logical or
  xor   (:= op = 11010) → (A ← A ⊕ R[r]);                             exclusive or
  cmpr  (:= op = 11011) → (N□Z ← A − R[r]);                           compare, used to set N and Z
  ld    (:= op = 11100) → (A' ← R[r]);                                load
  st    (:= op = 11101) → (R[r] ← A);                                 store
  shift (:= op = 11110) → (A' ← A × 2^s);                             shift right or left
  sl    (:= op = 11111) → (L ← r)                                     set operation length
)                                                                     end Instruction_execution
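
How the flags N, Z and C accompany the accumulator to form A' can be illustrated with a short Python sketch of ad and adc. It is a sketch under assumptions: bit 0 is taken as the most significant bit of the L'-character accumulator, N is taken to be that bit of the result, and the function name and the dictionary of flags are invented for the example.

```python
def add_with_flags(a, r, carry_in=0, length=1):
    """Model A' <- A + R[r] (+ C) on a `length`-character accumulator.

    Returns the new accumulator value and the flags N, Z and C.
    """
    bits = 8 * length
    mask = (1 << bits) - 1
    total = (a & mask) + (r & mask) + carry_in
    result = total & mask
    flags = {
        "N": result >> (bits - 1),   # bit 0 (most significant) of the result
        "Z": int(result == 0),       # set if the result is zero
        "C": total >> bits,          # carry out of bit 0
    }
    return result, flags


if __name__ == "__main__":
    # Double-length (16-bit) addition built from two single-character steps,
    # passing the carry from the low character to the high one (ad, then adc).
    low, f_low = add_with_flags(0xFF, 0x01)
    high, f_high = add_with_flags(0x12, 0x00, carry_in=f_low["C"])
    print(hex(high), hex(low), f_high)
```

Chaining the carry in this way is the usual reason for providing both add and add-with-carry on a character-oriented machine.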

Part  3 


The  instruction-set  processor  level: 
variations  in  the  processor 

In  this  part  we  discuss  computers  whose  ISP's  are  variations  from  the  main-line 
computers  in  Part  2.  These  variations  represent  historical  computers  that  have  not 
remained  viable  in  the  judgment  of  the  computer  engineering  community,  responses 
to  particular  technology,  and  explorations  that  were  either  too  advanced  for  their 
time  or  still  exist  as  open  options. 

Section 1, Processors with greater than 1 address per instruction, is mostly of
historical and comparative interest. The general register organization with large Mp's
(hence large addresses) almost surely dominates them.

Section  2,  Processors  constrained  by  a  cyclic,  primary  memory,  describes  a 
response  to  a  historical  feature  of  Mp  technology.  The  use  of  a  drum,  delay  line, 
or  disk  was  a  matter  of  necessity  rather  than  choice.  When  better  random  access 
core  memories  were  available,  the  drum  ceased  to  be  a  primary  memory  component. 

Section  3  presents  processors  for  variable  string  data.  These  processors  are  no 
longer  built  in  their  original  form.  However,  they  were  very  successful  for  a  while 
(IBM  1401).  Furthermore,  string  data-types  have  been  incorporated  in  later  proc- 
essors. 

Section  4  presents  two  desk  calculator  computers.  Although  we  too  often  dismiss 
these  devices  as  mere  desk  calculators,  they  have  facilities  that  qualify  them  as 
general  purpose  stored  program  computers.  Unlike  most  computers,  because  of  the 
production  cost  constraint,  these  calculator  computers  are  all  very  cleverly  designed. 

Section  5,  Processors  with  stack  memories,  describes  an  organization  that  has 
never  reached  the  main  line  state.  Nevertheless,  the  idea  of  a  stack  memory  is 
gradually  being  assimilated.  For  example,  the  DEC  PDP-6  and  PDP-10  computers 
use  their  general  registers  for  stack  pointer  control,  as  suggested  in  Chap.  3,  page 
62. 

In  Sec.  6  the  ideas  of  multiprogramming  are  presented.  These  ideas  are  recent 
and  have  not  yet  been  adequately  incorporated  in  main  line  designs.  They  undoubt- 
edly will  be  standard  features  in  the  next  generation,  although  the  exact  form  can- 
not yet  be  known. 




Section  1 


Processors  with  greater  than  1 
address  per  instruction 


Multiple-address instruction formats exist for several reasons.
The addition of an explicit address to determine the next instruction
occurs with cyclic Mp's to make them efficient. Section
2 is devoted to this case, and it will not be considered further
here. These processors are known as n + 1 address. A second
reason is that many operations have more than one operand
(as in A + B or A ∨ B), and it seems to be efficient encoding
to put them all into an instruction. A third reason is that many
operations need to be followed by writing the result in memory,
to permit the Pc to be used for operations on other data. Thus,
coupling each operation with the address where the result
is to be stored seems to be advantageous. However, in evaluating
complex arithmetic expressions, more instruction bits and
memory references are required than in a single-address computer.
Also, for unary operators one address field is unused.
It seems fair to say that ISP organizations with two or three
addresses have not proved themselves in competition with the
main line of 1, (1 + index), or (1 + general register) organizations.
However, no definitive demonstration of their inefficiency
under all technological conditions exists, and they are worth
studying.

For microprogrammed processors, multiple-address instructions
allow a high degree of parallelism to be obtained in a
single instruction. Multiple-address formats survive in this form.


The  Pilot  ACE 


The National Physical Laboratory's Pilot ACE is the first of
several cyclic memory computers which have been designed to
provide optimum coding of instructions. Subsequent machines
which it influenced include the nearly identical English Electric
Deuce, the Bendix G-15, and the Packard Bell PB-250.¹ The
PMS structure does not strictly follow our lattice model (page
65). The Deuce PMS structure is given in Fig. 1. A 32-word
block in Mp.delay_line can be transferred to Ms.drum in one
instruction (transfer time of 1,024 μs). Another capability of
ACE allows it to perform operations on vectors of up to 32
elements in 1 instruction.

¹H. D. Huskey was involved in the design of ACE, G-15, and PB-250; he was
undoubtedly the idea carrier.

The ACE structure (Chap. 11) has a common M which contains
much of the processor state and Mp. Many of the locations
used for processor state can store programs for direct execution.
The diagram on page 198 in Chap. 11 describes the instruction
execution process and implementation.

Alan M. Turing is credited with the basic design of ACE
(see Introduction, page 193, and Turing's biography [Turing,
1959]).

[Fig. 1. English Electric Deuce PMS diagram. Components shown:
Mp (delay line; cyclic; 32 ~ 1024 μs/w; 32 w; 32 b/w);
Ms (moving head drum; 8192 w; 32 b/w; 16 tracks/position;
32 w/track; 16 positions); T.console; T (card; reader);
Pc (technology: vacuum tubes; 1955 ~ 1961; (2+1) address/instruction;
ancestors: NPL ACE).]

ZEBRA, a simple binary computer

ZEBRA illustrates the organizational details of another serial
arithmetic computer with Mp.cyclic. ZEBRA, like ACE, allows the
user to construct instructions for the hardware which are almost
directly interpreted. In both ACE and ZEBRA very little decoding
is built into the machine; a large instruction set is available
since the instructions are microcoded. In these computers the
programming problem can be as complex as the user wishes,
because a large number of different instructions can be microcoded.
The LGP-30 (Chap. 16), by contrast, has only a basic
instruction set. Hence a problem can be coded only one or two
ways. ZEBRA's performance of 50 percent memory-cycle utilization
is rather outstanding and raises the possibility that random-access
primary memories may not be necessary.

UNIVAC  scientific  (1103A)  instruction  logic 

The UNIVAC 1103A (Chap. 13) is a two-address computer. The
computer was designed initially by Engineering Research Associates
(ERA) of St. Paul.¹ UNIVAC acquired ERA in 1952 as a
scientific-computer division. The evolution of the 1103A later
yielded the 1107 and 1108 general register processors. The
reader should compare the 1103A with the IBM 704 series
(Chap. 41). At the time both were used, it was not clear which
computer was better.

¹As the third in a series that started with the ERA 1101 and 1102.


The RW-400: a new polymorphic data system

The  RW-400  in  Chap.  38  is  a  two-address,  binary  computer.  It 
is  discussed  in  Part  5,  Sec.  4,  page  470. 

Instruction  logic  of  the  MIDAC 

The  University  of  Michigan's  MIDAC  (Michigan  Digital  Auto- 
matic Computer)  is  based  on  the  National  Bureau  of  Standards' 
SEAC  (Standards'  Electronic  Automatic  Computer).  MIDAC,  a 
three-address,  binary  computer,  is  presented  in  Chap.  14. 

Instruction  logic  of  the  Soviet  Strela  (Arrow) 

The  Russian  Strela  is  presented  in  Chap.  15.  Since  it  is  used 
only  to  illustrate  a  three-address  organization,  the  chapter  con- 
sists of  only  the  instruction  set. 


Chapter  11 


The Pilot ACE¹

J. H. Wilkinson

Introduction

A machine which was almost identical with the Pilot ACE was
first designed by the staff of the Mathematics Division at the
suggestion of Dr. H. D. Huskey during his stay at the National
Physical Laboratory in 1947. It was based on an earlier design
by Dr. A. M. Turing and its principal object was to provide experience
in the construction of equipment of this type. It was not
intended that it would be used on an extensive programme of
computation, but it was hoped that it would give practical experience
in the production of subroutines which would serve as a
useful guide to the design of a full scale machine. An attempt to
build the Pilot Model, during Dr. Huskey's stay, was unsuccessful,
but a year later after the formation of an Electronics Section at
the NPL a combined team consisting of this section and four
members of the Mathematics Division started on the construction
of a Pilot Model, the design of which was taken over almost
unchanged from the earlier version. The machine first worked, in
the sense that it carried out automatically a simple sequence of
operations, in May 1950 and by the end of that year it had reached
the stage at which a successful Press Demonstration was held. The
successful application of the machine to the solution of a number
of problems made it apparent that, in spite of its obvious shortcomings,
it was capable of being converted into a powerful computer
comparable with any then in existence and much faster than
most. Accordingly a small programme of modifications was embarked
upon early in 1951, but the machine was not functioning
satisfactorily again until November of that year. After a month
of continuous operation it was transferred from the Electronics
Section to Mathematics Division where it has since been in use
on a 13-hour day. During its first year of full scale operation it
achieved a 65% serviceability figure based on a very strict criterion.
Its performance during its second year has so far been considerably
better than this.

¹Automatic Digital Computation, National Physical Laboratory, Teddington,
England, pp. 5-14, March, 1953.


General  description 

The Pilot ACE is a serial machine using mercury delay line storage
and working at a pulse repetition rate of 1 megacycle/sec. Its high
speed store consists of 11 long delay lines each of which stores
32 words of 32 binary digits each, with a corresponding circulation
period of 1024 microseconds, 5 short lines storing one word each
with a circulation period of 32 microseconds and two delay lines
storing two words each. It was inevitable that in the design of
a machine originally intended for experimental purposes, overriding
consideration should be given to the minimization of equipment
rather than to making the machine logically satisfying as
a whole. This is reflected to a certain extent in the code adopted
for the machine and in its arithmetic facilities, which are in general
fairly rudimentary. The design of the machine was also decisively
influenced by the attempt to overcome the loss of speed
due to the high access time of the long storage units. The machine
in fact uses what is usually known as a system of "optimum
coding."


Code  of  Pilot  ACE 

The Pilot ACE may be said to have a "three-address code" though
this form of classification is not particularly appropriate. Each
instruction calls for the transfer of information from one of 32
"sources" to one of 32 "destinations" and selects which of eight
long delay lines will provide the next instruction. This third
address is necessary because consecutive instructions do not occupy
consecutive positions but are placed in such relative positions that,
in so far as is possible, each instruction emerges during the minor
cycle in which the current instruction is completed. An unusual
feature of the instructions is that the transfers they describe may
last for any number of consecutive minor cycles from one to thirty-two.
The instruction word contains three other main elements
which are known as the wait number, the timing number and the
characteristic which together determine when the transfer starts,
when it stops and which instruction in the selected instruction
source is the next to be obeyed. The structure of the instruction
word is as follows:


Next instruction source      Digits 2-4
Source                       Digits 5-9
Destination                  Digits 10-14
Characteristic               Digits 15-16
Wait number                  Digits 17-21
Timing number                Digits 25-29
Go digit                     Digit 32

The  remaining  digits  are  spare. 
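
The field layout above can be expressed as a small packing and unpacking routine. The Python sketch below assumes that digit 1 is the least significant digit of the 32-digit word (the text does not state the numbering direction), and the field names and functions are invented for the illustration.

```python
# (first digit, last digit) of each field, digits numbered 1..32.
FIELDS = {
    "next_instruction_source": (2, 4),
    "source": (5, 9),
    "destination": (10, 14),
    "characteristic": (15, 16),
    "wait_number": (17, 21),
    "timing_number": (25, 29),
    "go": (32, 32),
}


def encode(**values):
    """Build an instruction word from named field values; spare digits stay 0."""
    word = 0
    for name, value in values.items():
        first, last = FIELDS[name]
        width = last - first + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} digits")
        word |= value << (first - 1)      # assumes digit 1 is least significant
    return word


def decode(word):
    """Split an instruction word into its named fields."""
    fields = {}
    for name, (first, last) in FIELDS.items():
        width = last - first + 1
        fields[name] = (word >> (first - 1)) & ((1 << width) - 1)
    return fields


if __name__ == "__main__":
    w = encode(next_instruction_source=3, source=15, destination=17,
               wait_number=4, timing_number=6, go=1)
    print(f"{w:032b}")
    print(decode(w))
```

Keeping the layout in one table means the same description drives both directions, so the two routines cannot drift apart.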

Coding of a problem takes place in two parts, in the first of
which only the source, the destination and the period of transfer
are specified, the last being a function of the characteristic, wait
number and timing number. In the second part, the detailed coding,
the other elements are added.

The  sources  and  destinations 

Simplest  among  the  sources  and  destinations  are  those  associated 
with  the  short  delay  lines.  The  six  one-word  delay  lines  are  each 
given  numbers  and  these  for  reasons  associated  with  the  history 
of  the  machine  are  11,  15,  16,  20,  26  and  27.  They  are  usually 
referred  to  as  Temporary  Stores  or  TS  s  because  they  are  used  to 
store  temporarily  those  numbers  which  are  being  operated  upon 
most  frequently  at  each  stage  of  a  computation.  In  general  TSn 
has  associated  with  it  a  source,  source  n,  and  a  destination,  des- 
tination n.  An  instruction  of  the  type 

15-16 

in the preliminary stage of the coding represents the transfer of
a copy of the contents of TS15 via source 15 to TS16 via
destination 16. After it has taken place both stores contain the
number originally in TS15. The period of the transfer is not
mentioned in the coding because, for a plain copy of this kind,
prolonging the transfer beyond one minor cycle has no further
effect. Most transfers are for one minor cycle and hence the period
of transfer is not specified unless it is greater than one minor
cycle. Associated with the TS's are a number of functional sources
and destinations. TS16 for instance has two other destinations,
17 and 18, associated with it in addition to destination 16. Any
number transferred to destination 17 is added to the contents of
TS16 while any number transferred to destination 18 is subtracted
from the contents of TS16. TS16 may be said to have some of the
functions associated with the accumulator on an orthodox machine.
The period of transfer to destinations 17 and 18 is very important.
Thus

15-17 (n minor cycles)

has the effect of adding the contents of TS15 n times to the
contents of TS16. This prolonged transfer is used in this way to
give small multiples (up to 32) of numbers. Similarly, we may have

15-18 (n mc)

The instruction

16-17 (n mc)

is of special significance because it has the effect of adding the
content of TS16 to itself for each minor cycle of the transfer, that
is, it gives multiplication by 2ⁿ or a left shift of n binary places.
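The effect of a prolonged transfer to destinations 17 and 18 can be sketched in a few lines of Python. This is only an illustrative model, not a description of the hardware: the function names are invented here, and it is assumed that arithmetic is carried out modulo 2^32 on a 32-digit word.

```python
MASK32 = (1 << 32) - 1  # assumption: 32-digit words, arithmetic modulo 2**32


def prolonged_add(ts16: int, operand: int, minor_cycles: int) -> int:
    """Model 'S-17 (n mc)': a fixed operand is added to TS16 once per minor cycle."""
    for _ in range(minor_cycles):
        ts16 = (ts16 + operand) & MASK32
    return ts16


def prolonged_subtract(ts16: int, operand: int, minor_cycles: int) -> int:
    """Model 'S-18 (n mc)': the operand is subtracted from TS16 once per minor cycle."""
    for _ in range(minor_cycles):
        ts16 = (ts16 - operand) & MASK32
    return ts16


def prolonged_self_add(ts16: int, minor_cycles: int) -> int:
    """Model '16-17 (n mc)': TS16 is added to itself in every minor cycle."""
    for _ in range(minor_cycles):
        ts16 = (ts16 + ts16) & MASK32  # each cycle doubles the current contents
    return ts16
```

With TS16 holding 10 and TS15 holding 7, `prolonged_add(10, 7, 3)` models 15-17 (3 mc), and `prolonged_self_add(5, 4)` models 16-17 (4 mc), i.e. a left shift of four places.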

TS26 has associated with it a number of functional sources.
Source 17 gives the ones complement of the number in TS26,
Source 18 the contents divided by 2, and Source 19 the contents
multiplied by 2. The instruction

18-26 (n mc)

thus has the effect of dividing the contents of TS26 by 2ⁿ, that
is, a right shift of n places. Similarly

19-26 (n mc)

gives a left shift of n places.

There are two functional sources which give composite functions
of the numbers in TS26 and TS27. These are Source 21 which
gives the number

TS26 & TS27

and Source 22 which gives the number

TS26 ≢ TS27

There are a number of sources which give constant numbers which
are of frequent use in computation. These are Source 23, which
gives the number which has a zero everywhere except in the 17th
position, usually known as P17; Source 24, which gives P32; Source
25, which gives P1; Source 28, which gives zero; and Source 29,
which gives a number consisting of 32 consecutive ones. These
sources are valuable because they provide numbers with an access
time of one minor cycle and are thus almost as useful as several
extra TS's.
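As a rough illustration, the functional sources on TS26 and the constant sources can be modelled as a lookup on the source number. The sketch below rests on assumptions that the text does not state outright: digit position 1 is taken to be the least significant (consistent with the use of 25-17 to add one in the example that follows), division by 2 is modelled as an unsigned shift, and the function name `read_source` is invented for the purpose.

```python
MASK32 = (1 << 32) - 1  # 32 consecutive ones


def read_source(number: int, ts26: int, ts27: int) -> int:
    """Return the 32-digit word offered by a few of the sources described above."""
    table = {
        17: ~ts26 & MASK32,         # ones complement of TS26
        18: ts26 >> 1,              # TS26 divided by 2 (assumed unsigned)
        19: (ts26 << 1) & MASK32,   # TS26 multiplied by 2
        21: ts26 & ts27,            # TS26 & TS27
        23: 1 << 16,                # P17: a one in the 17th position only
        24: 1 << 31,                # P32: a one in the 32nd position only
        25: 1,                      # P1: a one in the 1st position only
        26: ts26,
        27: ts27,
        28: 0,                      # zero
        29: MASK32,                 # 32 consecutive ones
    }
    return table[number]
```

Repeated use of source 18 with destination 26, as in 18-26 (n mc), then corresponds to applying `read_source(18, ...)` n times and writing the result back to TS26 each time.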

The use of a number of TS's with the arithmetic facilities
distributed among them makes it possible to take advantage of
the placing of instructions in appropriate positions in the long
storage units so that they emerge as required. The coding of a
trivial example will illustrate the uses of the TS's and their
associated sources. It is required to build up the successive natural
numbers, their squares and their cubes simultaneously. It is natural
to store the values in TS's and we may suppose TS15 contains
n, TS20 n², and TS26 n³.


      Instruction      Description

 1.   28-15            zero to TS15, i.e. 0    (Instructions 1 to 3 set
 2.   28-20            zero to TS20, i.e. 0²    the initial values)
 3.   28-26            zero to TS26, i.e. 0³
 4.   26-16            TS16 contains n³
 5.   20-17 (3 mc)     TS16 contains n³ + 3n²
 6.   15-17 (3 mc)     TS16 contains n³ + 3n² + 3n
 7.   25-17            TS16 contains n³ + 3n² + 3n + 1
 8.   16-26            TS26 contains (n + 1)³
 9.   20-16            TS16 contains n²
10.   15-17 (2 mc)     TS16 contains n² + 2n
11.   25-17            TS16 contains n² + 2n + 1
12.   16-20            TS20 contains (n + 1)²
13.   15-16            TS16 contains n
14.   25-17            TS16 contains (n + 1)
15.   16-15            TS15 contains (n + 1). Next instruction (4)

Instructions (1) to (3) set the initial conditions. Instructions
(4) to (15) have the effect of changing the contents of 15, 20, 26
from n, n², n³ to (n + 1), (n + 1)², (n + 1)³. As remarked earlier,
each instruction selects the next instruction and here instruction
(15) selects instruction (4) as the next instruction. In the
preliminary coding this is usually denoted by using an arrow; it must
be catered for in the detailed coding by the correct choice of the
timing number, as will be shown below.
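The preliminary coding of this example can be checked by following it step by step. The sketch below is a plain Python transcription of instructions (1) to (15), with each prolonged transfer to destination 17 written as a multiplication; the function names are invented here, and word-length overflow is ignored.

```python
def one_pass(ts15: int, ts20: int, ts26: int):
    """Obey instructions (4) to (15) once, starting from n, n**2, n**3."""
    ts16 = ts26         # 4.  26-16         TS16 = n^3
    ts16 += 3 * ts20    # 5.  20-17 (3 mc)  + 3n^2
    ts16 += 3 * ts15    # 6.  15-17 (3 mc)  + 3n
    ts16 += 1           # 7.  25-17         + 1 (source 25 gives P1)
    ts26 = ts16         # 8.  16-26         TS26 = (n + 1)^3
    ts16 = ts20         # 9.  20-16         TS16 = n^2
    ts16 += 2 * ts15    # 10. 15-17 (2 mc)  + 2n
    ts16 += 1           # 11. 25-17         + 1
    ts20 = ts16         # 12. 16-20         TS20 = (n + 1)^2
    ts16 = ts15         # 13. 15-16         TS16 = n
    ts16 += 1           # 14. 25-17         + 1
    ts15 = ts16         # 15. 16-15         TS15 = n + 1
    return ts15, ts20, ts26


def run(passes: int):
    """Instructions (1) to (3) clear the three stores; then repeat (4) to (15)."""
    ts15 = ts20 = ts26 = 0
    history = []
    for _ in range(passes):
        ts15, ts20, ts26 = one_pass(ts15, ts20, ts26)
        history.append((ts15, ts20, ts26))
    return history
```

Calling `run(3)` yields the triples (1, 1, 1), (2, 4, 8) and (3, 9, 27), that is n, n² and n³ for n = 1, 2, 3.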

The branching of a programme is achieved by the use of two
destinations, destination 24 and destination 25. If a transfer is made
from any source to destination 24 then the next instruction is one
or other of two according as the number transferred is positive
or negative. Similarly if a transfer is made to destination 25 then
the next instruction is one or other of two according as the number
transferred is zero or non-zero. In the preliminary coding the
bifurcation is denoted by the use of arrows, thus:

        15-24
       /     \
    +ve       -ve

In the detailed coding the effect is that if the number transferred
to destination 24 is negative then the timing number is increased
by 1. Similarly for destination 25; the two possible next instructions
are consecutive in the store.
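The discrimination rule can be sketched as follows. The sketch uses the rule, given under the detailed coding, that an instruction in minor cycle m with timing number T takes its successor from minor cycle (m + T + 2) modulo 32. Two points are assumptions made here rather than statements of the text: that a negative number is one with a one in digit 32, and that for destination 25 it is the non-zero case which increases the timing number. The function name is hypothetical.

```python
SIGN_DIGIT = 1 << 31  # assumption: digit 32 carries the sign


def next_instruction_position(m: int, timing: int, destination: int, word: int) -> int:
    """Minor cycle from which the next instruction is taken (modulo 32)."""
    extra = 0
    if destination == 24 and word & SIGN_DIGIT:
        extra = 1  # negative number: timing number increased by 1
    elif destination == 25 and word != 0:
        extra = 1  # assumption: the non-zero case takes the second instruction
    return (m + timing + extra + 2) % 32
```

For an instruction in minor cycle 4 with T = 0, a positive number sent to destination 24 leads to minor cycle 6 and a negative one to minor cycle 7, so the two alternatives are consecutive in the store.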

The two double word stores are numbered DS12 and DS14.
DS12 has only source 12 and destination 12 associated with it,
but DS14 has, in addition to source 14 and destination 14, a
number of functional sources and destinations. Source 13 gives the
contents of DS14 divided by 2, while transfers to destination 13
have the effect of adding the numbers transferred to DS14. In
specifying transfers from, and to, the double length stores, the time
of the transfer must be specified, i.e. whether it takes place in an
even or an odd minor cycle or both. Thus the transfer

12-14 (odd minor cycle), usually written 12-14 (o),

represents the transfer of the word in the odd position of DS12
to the odd position in DS14 while

12-14 (2 minor cycles)

represents the transfer of both words in 12 to the corresponding
positions in 14. The operation

13-14 (2n)

gives us a method of shifting the contents of DS14 n places to the
right while

14-13 (2n)

produces a shift of n places to the left.

The machine is not equipped with a fully automatic multiplier.
To multiply two numbers, a and b, together, a must be sent to
TS20, b to DS14 odd, zero to DS14 even and a transfer (source
irrelevant) made to destination 19. The product is then produced
in DS14 in 2 milliseconds, but a and b are treated as positive
numbers. Corrections must be made to the answer if a and b are
signed numbers. To make multiplication fast, it has been made
possible to perform other operations while multiplication is
proceeding. Thus the corrections necessary if a and b are signed
numbers may be built up in TS16 during multiplication, and signed
multiplication takes only a little over two milliseconds. It is
therefore, of course, a subroutine, but a very fast one. The amount
of equipment associated with the multiplier is very small.

The main part of the store consists of the long storage units
known as DL1, DL2, . . . , DL11. Each of these has a source and a
destination with the same number as the DL number. The words in
each DL are numbered 0 to 31 and the nth word in DLM is usually
denoted by DLMₙ. Transfers to and from long lines in the
preliminary coding are denoted thus:


8ₙ-16       (transfer nth word of DL8 to TS16)
8ₘ₋ₙ-17     (add all the words from 8ₘ to 8ₙ, i.e. n − m + 1
            consecutive words of DL8, to TS16)

Detailed  coding 

In the second stage of the coding the true instruction words are
derived from the preliminary coding. This is a fairly automatic
process and recent experience has shown that it can be carried
out satisfactorily by quite junior staff. The timing of each
instruction is given relative to the position of that instruction in
the store. This is an incidental feature of the code which arose from
the attempts to minimize equipment. It would be dropped in any
future machine in favour of an absolute timing system. If an
instruction occupies position m in a DL and has a wait number
W and timing number T then the transfer always begins in minor
cycle (m + W + 2) and the next instruction is always in minor
cycle (m + T + 2) of the selected next instruction source. The
period of transfer depends on the value of the characteristic. If
the characteristic is zero then the transfer lasts for the whole
period from (m + W + 2) to (m + T + 2), that is (T − W + 1)
minor cycles. If the characteristic is one, then the transfer is for
one minor cycle, that is minor cycle (m + W + 2). If the
characteristic is three then the transfer is for two minor cycles,
(m + W + 2) and (m + W + 3). The characteristic value two
is not used. The characteristic value zero gives a prolonged transfer
which is peculiar to the Pilot ACE. The characteristics 1 and 3
are analogous to the facility on EDSAC whereby full length or
½-length words may be transferred. On the Pilot ACE we transfer
single or double length words. This facility is invaluable for double
length, floating and complex arithmetic. In the above definitions
the numbers (m + W + 2) etc. are to be interpreted modulo 32.
In general, timing and wait numbers are simpler than they appear
from the definitions because they are very frequently both zero,
corresponding to a transfer for one minor cycle. The detailed
coding of the problem given earlier will illustrate the procedure.
All the instructions are in DL1 so that the next instruction source
is always one. The key to the headings in the following table is:


m.c.      Minor cycle position of instructions in DL1
N.I.S.    Next instruction source
S         Source
D         Destination
C         Characteristic
W         Wait number
T         Timing number

The last column gives the position of the next instruction in DL1;
it is given by (m + T + 2). The first 4 instructions occupy minor
cycles 0, 2, 4 and 6; each takes two minor cycles and gives
a transfer for one minor cycle only. The next instruction occupies
minor cycle number 8 and it requires a transfer lasting 3 minor
cycles. The simplest and fastest way of getting this is to have
W = 0 and T = 2, giving a transfer of (2 − 0 + 1) minor cycles.
The next instruction is in position (8 + 2 + 2), that is minor cycle
12, and so on. When we reach the instruction in minor cycle 31,
viz. 25-17, a transfer for one minor cycle is required. The simplest
way is to have W = 0, T = 0 and this makes the next instruction
occupy position (31 + 0 + 2), i.e. position 33, which is position 1.
If position 1 had been already occupied, a value of T could have
been chosen in order to land in an unoccupied position. In order
to ensure that a transfer of one minor cycle only took place, the
characteristic could have been made 1. It should be appreciated
that the choice of C, W and T is far from unique. Whenever
possible T = 0 and W = 0 are chosen because this gives the
highest speed of operation besides being simplest. The instruction
occupying position 1 is of special interest because this is the last
instruction of the cycle needed to build up a square and cube and
it must select as its next instruction the first of the cycle, which
is in position number 6. This is achieved by making T = 3 (giving
the next instruction in m.c. 1 + 3 + 2 = 6). This incidentally
gives a transfer lasting four minor cycles but since it is a transfer
from one TS to another and no functional source or destination
is in use, the prolonged transfer produces no harmful effect. If a
prolonged transfer had to be avoided then the characteristic could
be taken as 1. It is seldom necessary to use any characteristic other
than zero for transfers to and from TS's but when transfers are
made to and from DL's, characteristic values of 1 or 3 are almost
universal. All 12 instructions which comprise the repeated cycle
of the computation take a total time of one major cycle exactly
(32 minor cycles), the last instruction of the cycle having been
specially designed to get back to the beginning of the cycle. This
is in contrast to the position in a machine not using optimum
coding, where 12 major cycles would be necessary, quite apart from
the fact that the multiplications by factors of 3 and 2, each of
which uses one instruction, would normally need more than one
instruction if a prolonged transfer were not available.

Figure 1 gives a simplified diagram of the machine. The sequence
of events in obeying the instruction

N    S    D    C    W    T

2    16 - 20   0    8    10

occupying DL1₂, for example, is as follows. Starting from the time
when the last instruction was completed, the instruction from
DL1₂ will have passed into the special TS marked TS COUNT
during minor cycle number 2. By the end of minor cycle number
3, S switch number 16 will be over and also N switch number
2. The contents of TS16 will be passing into HIGHWAY and those
of DL2 into INSTRUCTION HIGHWAY. At the beginning of
minor cycle number 12 (i.e. 2 + 8 + 2), D switch number 20 will
go over, and TS20 will stop recirculating and the number on the
HIGHWAY will pass into TS20. The transfer will continue until
minor cycle 14 (i.e. 2 + 10 + 2) when the D switch number 20
will switch back. At the beginning of minor cycle 14, the switch
X on COUNT will go over and the number on INSTRUCTION
HIGHWAY during this minor cycle, DL2₁₄, will pass into COUNT.
At the end of minor cycle 14, the X switch will close again and
DL2₁₄ will be trapped in COUNT. The cycle of events is now
complete. COUNT is associated with a counter and it is this
counter which determines from the wait, timing, and characteristic
numbers of the trapped instruction, when the D and X switches
go over and back.

Detailed coding of the example (all instructions in DL1; the last
column is the minor cycle position of the next instruction):

m.c.  N.I.S.   S    D    C    W    T    Next m.c.

  0     1     28   15    0    0    0      (2)
  1     1     16   15    0    0    3      (6)
  2     1     28   20    0    0    0      (4)
  3
  4     1     28   26    0    0    0      (6)
  5
  6     1     26   16    0    0    0      (8)
  7
  8     1     20   17    0    0    2      (12)
  9
 10
 11
 12     1     15   17    0    0    2      (16)
 13
 14
 15
 16     1     25   17    0    0    0      (18)
 17
 18     1     16   26    0    0    0      (20)
 19
 20     1     20   16    0    0    0      (22)
 21
 22     1     15   17    0    0    1      (25)
 23
 24
 25     1     25   17    0    0    0      (27)
 26
 27     1     16   20    0    0    0      (29)
 28
 29     1     15   16    0    0    0      (31)
 30
 31     1     25   17    0    0    0      (1)
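The timing rules of the detailed coding lend themselves to a small calculation. The sketch below is an illustration only: `schedule` applies the stated rules for m, C, W and T modulo 32, and `walk_loop` follows the timing numbers of the twelve instructions of the repeated cycle, as read from the detailed-coding table (with W = 0 and C = 0 throughout), to confirm that they occupy one major cycle. The function names are invented, and taking (T − W) modulo 32 for a prolonged transfer is an assumption about the case T < W, which the text does not discuss.

```python
def schedule(m: int, c: int, w: int, t: int):
    """Return (first minor cycle of transfer, length in minor cycles, next m.c.)."""
    start = (m + w + 2) % 32
    next_position = (m + t + 2) % 32
    if c == 0:
        length = (t - w) % 32 + 1   # prolonged transfer: T - W + 1 minor cycles
    elif c == 1:
        length = 1                  # single word
    elif c == 3:
        length = 2                  # double word
    else:
        raise ValueError("characteristic value two is not used")
    return start, length, next_position


# Timing number T of each instruction of the repeated cycle, keyed by its
# minor cycle position in DL1.
LOOP_TIMING = {6: 0, 8: 2, 12: 2, 16: 0, 18: 0, 20: 0,
               22: 1, 25: 0, 27: 0, 29: 0, 31: 0, 1: 3}


def walk_loop(start: int = 6):
    """Follow the cycle once; return positions visited, elapsed m.c., final position."""
    position, visited, elapsed = start, [], 0
    for _ in range(len(LOOP_TIMING)):
        t = LOOP_TIMING[position]
        visited.append(position)
        elapsed += t + 2            # successive instructions are T + 2 m.c. apart
        _, _, position = schedule(position, 0, 0, t)
    return visited, elapsed, position
```

For the instruction 2 16-20 0 8 10 in DL1₂, `schedule(2, 0, 8, 10)` gives a transfer starting in minor cycle 12 and a next instruction in minor cycle 14, in agreement with the sequence of events described above. `walk_loop()` returns to position 6 after 32 minor cycles.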

Input  and  output 

The only part of the instruction word not described is the GO
digit. If the GO digit is a one, the instruction is carried out at
high speed, but if it is a zero the machine stops and does not
proceed until a manual switch is operated. The GO digit is omitted
in strategic instructions when a programme is being tested. It also
serves a further purpose in synchronising the input and output
facilities with the high speed computer.

Fig. 1. Simplified diagram showing some sources, destinations, and
next-instruction sources. [Figure not reproduced.]

Input on the machine is by means of Hollerith punched cards.
When cards are passed through the reader the numbers on the card
may be read row by row as each passes under a set of 32 reading
brushes. When a row of a card is under the reading brushes, the
number punched on that row, regarded as a number of 32 binary
digits, is available on source 0. In order to make certain that
reading takes place when a row is in position and not between rows,
transfers from source 0 have the GO digit omitted and it is arranged
that the Hollerith reader has the same effect as operating the
manual switch each time a row comes into position. The passage of
a card through the reader is called for by a transfer from any source
to destination 31. No transfer of information from the card takes
place unless the appropriate instruction using source 0 is obeyed
during the passage of the card.

Output on the machine is also provided by a Hollerith punch. The
passage of a card through the punch is called for by a transfer from
any source to destination 30. While a card is passing through the
punch a 32 digit number may be punched on each row by a transfer
to destination 28. Again synchronisation is ensured by omitting the
GO digit in instructions calling for a transfer to destination 28,
and arranging that the Hollerith punch effectively operates the
manual switch as each row comes into position. The reader feeds
cards at the rate of 200 cards per minute and the punch at the rate
of 100 cards per minute. The speed of input for binary digits is
200 × 32 × 12 per minute or 1280 per second. The output speed is
640 digits per second.

Data may be fed in and out in decimal, but it then requires
conversion subroutines. The computation involved in the conversion
is done between the rows of the card and up to 30 decimal digits
per card may be translated. This speed of conversion is only
possible because of the use of optimum coding. The facility for
carrying out computation between rows of cards is used extensively,
particularly in linear algebra when matrices exceeding the storage
capacity of the machine are involved. The matrices are stored on
cards in binary form with one number on each of the 12 rows of
each card, all the computation being done either between rows
when reading or when punching. Times comparable with those
possible with the matrices stored in the memory are often achieved
in this way, when the computation uses a high percentage of the
available time between rows. Up to 80% of this time may be safely
used.
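The input and output rates quoted above follow directly from the card format, and the arithmetic can be set out as a short check. The function names below are invented for the illustration; the row period is simply the reciprocal of the row rate and should not be read as the exact time available between rows, which the text does not give.

```python
ROWS_PER_CARD = 12   # one 32-digit number on each of the 12 rows
DIGITS_PER_ROW = 32


def binary_digits_per_second(cards_per_minute: float) -> float:
    """Binary digits transferred per second at a given card rate."""
    return cards_per_minute * ROWS_PER_CARD * DIGITS_PER_ROW / 60


def row_period_ms(cards_per_minute: float) -> float:
    """Average time from one row to the next, in milliseconds."""
    return 60_000 / (cards_per_minute * ROWS_PER_CARD)
```

At 200 cards per minute the reader delivers 1280 binary digits per second and presents a new row every 25 ms on average; at 100 cards per minute the punch takes 640 digits per second.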

Initial  Input 

The  initial  input  of  instructions  is  achieved  by  choosing  destination 
0  in  a  special  manner.  When  a  transfer  is  made  to  destination  0, 
then  the  instruction  transferred  becomes  the  next  to  be  obeyed 
and  the  next  instruction  source  is  ignored.  Source  0  has  already 
been  chosen  specially  since  it  is  provided  from  a  row  of  a  card. 
The  instruction  consisting  of  zeros  has  the  effect  of  injecting  the 
instruction  punched  on  a  row  of  a  card  into  the  machine  as  the 
next  to  be  obeyed.  The  machine  is  started  by  clearing  the  store 
and  starting  the  Hollerith  reader  which  contains  cards  punched 
with  appropriate  instructions.  Destination  0  is  also  used  when  an 
instruction  is  built  up  in  an  arithmetic  unit  ready  to  be  obeyed. 

Miscellaneous  sources  and  destinations 

Destination 29 controls a buzzer. If a non-zero number is
transferred to destination 29 the buzzer sounds.

Source 30 is used to indicate when the last row of a card is
in position in the reader or punch. This source gives a non-zero
number only when a last row is in position.

The operation of the arithmetic facilities on DS14 may be modified
by a transfer to destination 23. If a transfer with an odd
characteristic is made from any source to destination 23 then, from
then on, DS14 behaves as though it were two single length
accumulators in series. This means that carries are suppressed at
the end of each of the single words. This condition persists until
a transfer is made to destination 23 using an even characteristic,
when DS14 behaves as an accumulator for double length numbers with
their least significant parts in even minor cycles and more
significant parts in odd minor cycles.

The operation of TS20 is modified by transfers to destination 21.
If a transfer with an odd characteristic is made to destination 21
then TS20 ceases to have an independent existence and from then
on is fed continuously from DL10. Source 20 then gives the
contents of DL10 one minor cycle later than from source 10. TS20
reverts to its former condition when a transfer with an even
characteristic is made to destination 21. The facility is used to move
the 32 words in DL10 round one position so that the word in minor
cycle n is available in minor cycle (n + 1).
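The two modes of DS14 selected through destination 23 differ only in whether a carry passes from the word in the even minor cycle to the word in the odd minor cycle. A minimal sketch, assuming 32-digit words and representing DS14 as a pair (even word, odd word), might read as follows; the function name and the `split` flag are inventions of this illustration.

```python
MASK32 = (1 << 32) - 1


def ds14_add(accumulator, addend, split: bool):
    """Add a double word to DS14; each argument is (even word, odd word).

    split=False: one double length accumulator, least significant part in
                 the even minor cycle, with the carry passed to the odd word.
    split=True:  two single length accumulators, carry suppressed at the end
                 of each single word.
    """
    even_a, odd_a = accumulator
    even_b, odd_b = addend
    even_sum = even_a + even_b
    carry = 0 if split else even_sum >> 32
    odd_sum = odd_a + odd_b + carry
    return even_sum & MASK32, odd_sum & MASK32
```

Adding one to an even word consisting of 32 ones gives (0, 1) in the double length mode and (0, 0) when the carry is suppressed.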

Assessment  of  optimum  coding 

A detailed assessment of the value of optimum coding is by no
means simple. Roughly speaking, subroutines are on an average
about 4 or 5 times as fast as on an orthodox machine using the
same pulse repetition rate. In main tables a somewhat lower factor
is usually achieved. The factor of 4 or 5 would be exceeded if less
of the advantage given by optimum coding were used to overcome
disadvantages due to the rudimentary nature of the arithmetic
facilities on Pilot ACE. Even so, the bald statement of the average
ratio of speeds does not do full justice to the value of optimum
coding on the Pilot ACE. Its value springs as much from the fact
that it has made possible the programmes in which computing
is done between the rows of cards and also the high output speed
of decimal numbers. The binary-decimal conversion routines for
punching out several decimal numbers simultaneously on a card,
and also decimal-binary conversion routines for reading several
numbers, achieve a ratio of something like 14 to 1, and on a
machine which is being used extensively for scientific computation
on a commercial basis this is of immense importance.

Future  programme 

Engineered versions of the Pilot Model are now under construction
by the English Electric Company. These machines will be similar
to the Pilot Model but will have a little more high-speed store,
an automatic divider, two quadruple length stores and a
subtractive input on the double length accumulator, besides several
minor modifications including a rationalization of the numbering
of the stores! In addition a magnetic drum intermediate store with
the equivalent of 32 DL's storage capacity will be added. A full
scale machine will probably soon be under development employing a
4 address code. Typical instructions will be of the form

A ± B → C

and will select the next source of instruction. This code is more
economical in instruction storage space and since all single word
stores will then become complete accumulators with all facilities
except multiplication on them, it will be possible to take much
fuller advantage of optimum coding.

Sources, destinations and next instruction sources

      Sources                    Destinations                 Next instr. sources

 0.   Input                 0.   INSTRUCTION             0.   DL11
 1.   DL1                   1.   DL1                     1.   DL1
 2.   DL2                   2.   DL2                     2.   DL2
 3.   DL3                   3.   DL3                     3.   DL3
 4.   DL4                   4.   DL4                     4.   DL4
 5.   DL5                   5.   DL5                     5.   DL5
 6.   DL6                   6.   DL6                     6.   DL6
 7.   DL7                   7.   DL7                     7.   DL7
 8.   DL8                   8.   DL8
 9.   DL9                   9.   DL9
10.   DL10                 10.   DL10
11.   DL11                 11.   DL11
12.   DS12                 12.   DS12
13.   DS14 ÷ 2             13.   DS14 add
14.   DS14                 14.   DS14
15.   TS15                 15.   TS15
16.   TS16                 16.   TS16
17.   TS26 (ones           17.   TS16 add
      complement)
18.   TS26 ÷ 2             18.   TS16 subtract
19.   TS26 × 2             19.†  MULTIPLY
20.   TS20                 20.   TS20
21.   TS26 & TS27          21.   Modifies Source 20
22.   TS26 ≢ TS27          22.
23.   P17                  23.   Modifies Source 13,
                                 Destination 13
24.   P32                  24.   DISCRIMINATE on sign
25.   P1                   25.   DISCRIMINATE on zero
26.   TS26                 26.   TS26
27.   TS27                 27.   TS27
28.   Zero                 28.   Output
29.   Ones                 29.   BUZZER
30.   Last row of card     30.†  PUNCH
31.                        31.†  READ

† Independent of source used.


References

WilkJ53; TuriS59


Chapter  12 

ZEBRA, a simple binary computer¹


W.  L.  van  der  Poel 


Summary  The computer ZEBRA is a computer based on the following
ideas:

1.  The logical structure of the arithmetic and control units of the
machine has been simplified as much as possible; there is not even
a built-in multiplier nor a divider.

2.  The separate bits in an instruction word are used functionally and
can be put together in any combination.

3.  Conventional two stage operation (set-up, execution) has been
abandoned. Each unit time interval can be used for arithmetical
operations.

4.  A small number of fast access registers is used as temporary storage;
at the same time these registers serve as modifier registers (B-lines).

5.  Optimum programming is almost automatically done to a very great
extent. The percentage of word times effectively used is usually
greater than 60%.

6.  An instruction can be repeated and modified while repeated by
using an accumulator as next instruction source and the address
counter as counter. This can be done without any special hardware.

This has resulted in a machine which has a very simple structure and hence
contains only a very moderate number of components, giving high
reliability and easy maintenance. Because of the functional bit coding, the
programming is extremely flexible. In fact the machine code is a sort of
micro-programming. Full-length multiplication or half-length
multiplication in half the time are just as easy; they only require a different
micro-programme. The minimum latency programming together with the
effective use of word times lost in other systems results in a very high speed
of operation compared to the basic clock pulse frequency.


Introduction 

In the Dr. Neher Laboratory of the Dutch Postal &
Telecommunications Services the logical design of a computer called
ZEBRA has been developed, and this computer has been engineered
and constructed by Standard Telephones & Cables Ltd, England.
The logical system is so different from most computers that it
is worth while to devote a special lecture to it. As time is limited,
no technical details nor questions about dimensions or capacity
will be discussed. They can all be found in the literature [van
der Poel, 1956; van der Poel, 1952].

¹Proc. ICIP, UNESCO, pp. 361-365, June, 1959.

The  main  idea  of  the  machine  is  to  economise  as  far  as  possible 
on  the  number  of  components  by  simplifying  the  logical  structure. 
For  example,  multiplication  and  division  are  not  built  in  but  must 
be  programmed.  Of  course  this  system  can  only  work  with  an 
appropriate  internal  code  which  has  enough  properties  to  execute 
basic arithmetic and logical routines effectively. In fact, the
internal machine code is more or less a system of microprogramming
[Wilkes  and  Stringer,  1953]. 

Operation  part  of  the  instruction 

The most conspicuous, but probably not the most important,
characteristic is the functional use of the separate bits in the
operation part of an instruction. An instruction word in ZEBRA
is composed as follows:

            15 bits                       5 bits       13 bits

A K Q L R I B C D E | V X4 X2 X1 | W     0 0 0 0 0    x x x x x x x x x x x x x
                      test bits          fast store   drum store address
         operation part                  address

It is a binary, two-address machine with one address of 13 bits
for the selection of a location in the main store (a drum of 8192
locations divided into 256 tracks of 32 words each), and a second
address of 5 bits for the selection of one of 12 fast access store
registers and several permanently wired locations (e.g., input,
output, accumulators, constants). The operation part has 15 bits,
each one having a separate and independent meaning. The most
important of these are the A, K, D and E bits.
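The arithmetic of the drum address can be illustrated briefly: 13 bits give 2^13 = 8192 locations, which is 256 tracks of 32 words. The sketch below splits an address into a track and a word position. It is an assumption made here, not a statement of the text, that the eight more significant bits select the track and the five less significant bits the word within the track; the function name is hypothetical.

```python
TRACKS = 256
WORDS_PER_TRACK = 32


def split_drum_address(address: int):
    """Split a 13-bit drum address into (track, word within track).

    Assumption: high 8 bits = track, low 5 bits = word position.
    """
    if not 0 <= address < TRACKS * WORDS_PER_TRACK:
        raise ValueError("drum address must fit in 13 bits")
    return address >> 5, address & 31
```

Under this assumption address 8191, the last location, falls at word 31 of track 255.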

A-  and  K-bits 

There  are  four  main  components  in  the  machine:  the  drum  store, 
the  fast  store,  the  arithmetic  unit  and  the  control.  The  A-bit  in 
the  instruction  controls  the  interconnection  of  the  drum  and  the 


Chapter  12  i  ZEBRA,  a  simple  binary  computer  201 


Fig.  1.  The  main  units  of  the  computer. 

arithmetic  unit  or  the  control.  In  the  same  way  the  K-bit  controls 
the  interconnection  of  the  fast  store  with  the  arithmetic  unit  or 
the  control  unit.  These  interconnections  can  be  seen  from  Fig.  1. 

It will be seen that A and K can have 4 possible combinations:

Case 1. A = 0, K = 0. This is called the adding jump (Fig. 2a).

While a new instruction is coming into the control from the drum,
the arithmetic unit can at the same time do an operation with
the operand coming from the fast store. This is the fastest type
of operation. When the following instruction is placed in the next
location on the drum there is no waiting time, and 32 instructions
of this type can be executed per revolution. (One revolution = 10
ms, one word time = 312 μs.)

Case 2. A = 0, K = 1. This is called the double jump (Fig. 2b).

Both stores are now used for giving information to the control,
i.e., making a jump. Since the fast store is used for the control,
the instruction coming in from the drum is modified by the
contents of a fast register. In this way the B-line facility, as it is often
called, is realised.

Case 3. A = 1, K = 0. This is called the double addition (Fig. 2c).

Both stores are now connected to the arithmetic unit. The control
must take care of itself using the address counter which is stepped
up by 2 at a time, thus enabling this type of instruction to reach
the number lying between the two successive instructions without
any waiting time. Constants in particular will always be taken
from optimum places on the drum.

Case 4. A = 1, K = 1. This is called the jumping addition (Fig.
2d).

While the drum is used for the arithmetic unit the address counter
is modified by a fast register. Control may thus be passed to any
instruction, and not only to the next instruction.
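The four cases can be summarised as a small routing table. The sketch below is only a restatement of the cases in Python; the function name `route` and the string labels are chosen for the illustration and do not come from the machine's documentation.

```python
CASE_NAMES = {
    (0, 0): "adding jump",
    (0, 1): "double jump",
    (1, 0): "double addition",
    (1, 1): "jumping addition",
}

def route(a_bit, k_bit):
    """Return the case name and the unit each store is connected to."""
    # A selects the destination of the drum store.
    drum_to = "arithmetic unit" if a_bit else "control"
    # K selects the destination of the fast store.
    fast_to = "control" if k_bit else "arithmetic unit"
    return CASE_NAMES[(a_bit, k_bit)], drum_to, fast_to

for a in (0, 1):
    for k in (0, 1):
        print(a, k, *route(a, k))
```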


D- and E-bits

The  functional  bits  D  and  E  control  the  direction  of  flow  of  infor- 
mation. 

D  =  0  means:  read  from  the  drum. 
E  =  0  means:  read  from  the  fast  store. 
D  =  1  means:  write  to  the  drum. 
E  =  1  means:  write  to  the  fast  store. 

A few possible instructions will be given below. In the written
code a drum address will always be written with 3 or more digits
and the absence of the A-bit will be indicated by the letter X.
(This is necessary for the input programme to recognize the
beginning of a new instruction.)

A200.5     Add <200> (the contents of address 200) and <5>
           to the accumulator. Step the address counter
           by 2.

X200E5     Take next instruction from 200 (= jump to 200)
           and store contents of accumulator in 5.

X200KE5    Jump to 200 and store previous contents of
           address counter in 5. This amounts to placing a link
           instruction for return from a sub-routine.

X200K5     Take next instruction from 200 but modify it with
           <5>, thus making a variable instruction.

Arithmetic  bits 

The  remainder  of  the  function  bits  have  arithmetic  meanings.  We 
shall  only  briefly  indicate  their  different  actions. 

B: Do not use the A accumulator (most significant accumulator)
but the B accumulator.


Fig.  2.  The  possible  combinations  of  the  A-  and  K  bits. 


202  Part 3 | The instruction-set processor level: variations in the processor  Section 1 | Processors with greater than 1 address per instruction


C: Clear the accumulator specified by B after storing, or before
addition. (In a serial machine like ZEBRA this is
automatically the case, cf. Fig. 3.)

I: Subtract instead of add.

Q: Add one (unit in the least significant place) to the
B accumulator.

L: Shift both accumulators one place to the left.

R: Shift both accumulators one place to the right. The
accumulators are always coupled together in shifting except
when C is present.

A  few  more  examples  will  be  given. 

A200BCE25      Store <B> in 5, clear B and add <200>
               to B.

X200QLIBCE6    Jump to 200. Store <B> in 6, put -1 in B
               (because of QIBC) and shift the A accumulator
               one place to the left. Shifting from B
               into A is prevented by the presence of C.

X200RBC3       Jump to 200. Shift A to the right. Copy <3>
               into B. As register 3 is just an address for
               the B accumulator itself, this means that
               A is shifted while B is static.

X200K3QIBC     Take the instruction from 200 and modify
               it with the contents of the B accumulator
               (= register 3). Put -1 in B afterwards.


As can be seen, many complicated operations can be composed
from the elementary possibilities of the separate bits.

The  accumulator 

A simplified block diagram of one of the accumulators is shown
in Fig. 3.

Shifting is effected by looping the accumulator over one place
less or one place more. In a double addition the contents of the
drum store and the fast store are first added together in the
pre-adder (possibly augmented by unity in the B accumulator, if Q
is present) and this result is added into the accumulator (or
subtracted in case of I). A clearing gate controlled by C interrupts
the recirculation of the previous contents.

The  control  unit 

The  control  unit  has  two  shifting  registers,  the  C-register  which 
receives  the  next  instruction  to  be  executed  and  the  D-register 
or  counter.  The  block  diagram  is  shown  in  Fig.  4.  After  a  new 
instruction  has  come  into  C,  it  is  taken  over  in  parallel  form  into 
E  in  the  interword  time.  It  remains  in  E  while  the  next  instruction 
is  coming  into  C.  Let  us  explain  the  action  of  this  control  with 
a  short  programme. 

Examples  of  programmes 

100  X101E5
101  AC102
102  constant
103  etc.

The actions in the several registers are now:

<A>      <C>       <D>

         X100                Suppose X100 is in C at the start.
                             This will take <100> into C. <C> + 2 → D.

         X101E5    X102      Another jump comes into C taking in <101>
                             and storing <A> → 5.
                             <C> + 2 → D gives X103E5.
                             Note that the operational part is kept in the
                             counter.

         AC102     X103E5    The necessary constant from 102 is
                             just becoming available.

const.   X103E5              The next instruction is taken from 103 which
                             is immediately following. The constant in
                             A is stored to 5 by E5, and is still active
                             after coming back from D.

Fig. 3. Accumulator.


Fig. 4. Control unit.


This is the most important aspect of the machine. An instruction
in the address counter comes back after an A-instruction and can
do something useful. To our surprise we found that in many more
cases than we first suspected, the second action could be used
effectively. In most other computers the time of access to the next
instruction is lost because nothing can be done concurrently in
the arithmetic unit.

Another  example  of  the  action  of  the  control  is  the  jump  to  a 
sub-routine.  Suppose  that  we  have  the  following  piece  of  pro- 
gramme: 

100  X200KE5    Jump to sub-routine starting in 200. Place
                return jump in 5.

102  etc.       Sub-routine returns here.

The action is as follows:

<C>        <D>

X100                  The instruction is taken from 100.

X200KE5    X102       X200KE5 → C and X100 + 2 → D. Now
                      KE5 stores D in 5. Thus <5> = X102.

<200>
 ...
XK5                   The subroutine at 200 is executed and ends
                      with XK5: jump to 5.

X102                  Take instruction from 5.

<102>                 Now the main programme proceeds to 102
                      etc.

By ending the sub-routine:

220  X221K5
221  -1

we can return not two but one location further on, i.e., X221K5
takes as next instruction <5> - 1 = X101. Here 5 contains the
instruction and the drum the modifier.


The  test  bits 

The digits V, X4, X2, X1 will not be dealt with extensively but the
different combinations of these 4 digits represent different types
of test. When for example V1 is attached to an instruction, this
instruction will be executed when <A> is negative, but will be
skipped altogether when <A> is positive or zero. The harmless
A-instruction will then be executed instead. The test can be
attached to a jump, giving a conditional jump, as well as to an
A-instruction, giving a conditional addition.


The W-bit

So far the digit W has not been mentioned. When W is present
in an instruction the drum address is not used. The instruction
is not kept waiting but is immediately executed and the drum is
completely disregarded. With the help of this digit W, jumps can
be made to instructions in the fast store, e.g., XK5W takes the
instruction from 5 only, and the drum does not deliver any number.
The use of this type of instruction has very peculiar consequences.
Let us take the following example:


100  X101KE6
101  X8186K5RW
102  etc.

<5>  ARW
<6>  filled with return instruction

The action is as follows:

<A>        <C>          <D>

a          X100                      Take instruction from 100.

a          X101KE6      X102         Jump to 101 and store return
                                     instruction X102 in 6.

a          X8186K5RW                 Do 1 right shift.

2^-1 · a   ARW          X8188K5RW    Do another right shift by ARW.
                                     The drum address in D is
                                     counted up but is not active.
                                     The register address remains
                                     the same. Hence the instruction
                                     in 5 is repeated.

2^-2 · a   X8188K5RW                 The repeating instruction as
                                     well as the repeated instruction
                                     are both shifted one place to
                                     the right.

2^-3 · a   ARW          X8190K5RW
2^-4 · a   X8190K5RW
2^-5 · a   ARW          X000K6RW     As the drum address overflows
                                     into the fast store address the
                                     repeating instruction becomes
                                     X8192K5RW = X000K6RW,
                                     taking the next instruction from 6.

2^-6 · a   X000K6RW

2^-7 · a   X102                      As <6> = X102 the repetition
                                     returns to the main programme
                                     and the A accumulator is shifted
                                     over 7 places.

The instruction ARW has thus been repeated p times when the
drum address of the repeating instruction is 8192 - 2p. This way
of repeating an instruction has made it possible to do
multiplication, division, block transfers, table look up and many other
small basic repetitive processes in a very simple way. There is no
special hardware present in the machine to do the counting
necessary for the repetition, as this counting is done by the normal
address counter.
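The counting rule can be checked with a short Python model. It is a sketch under one stated assumption: the 13-bit drum address sits directly below the 5-bit fast store address, so that a carry out of the drum field increments the register address, which is how the text describes the overflow. The function name `run_repeat` is hypothetical.

```python
DRUM_BITS = 13  # width of the drum address field, from the text

def run_repeat(drum_address, register):
    """Count how often the instruction held in `register` is executed.

    Assumption: a carry out of the drum address field increments the
    fast store (register) address of the repeating instruction.
    """
    field = (register << DRUM_BITS) | drum_address
    repeats = 0
    while (field >> DRUM_BITS) == register:
        repeats += 1   # the repeated instruction is executed once
        field += 2     # the address counter steps the drum address by 2
    next_register = field >> DRUM_BITS
    next_drum = field & ((1 << DRUM_BITS) - 1)
    return repeats, next_register, next_drum

# X8186K5RW: ARW in register 5 is executed 3 times, then control goes to register 6.
print(run_repeat(8186, 5))
```

With a drum address of 8192 - 2p the model reports p executions, in agreement with the rule stated above.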

As a last example we shall give a programme for the
summation of a block of locations from 200 to 300 in the store. This
involves 101 locations. The programme reads:

100  A101BC       Put A200Q in B (B has address 3).
101  A200Q

102  X103KE4C     Put return jump X104 in 4. Clear A in
                  advance.

103  X7990K3W     Repeat A200Q 101 times. Because A200Q
                  is standing in B the Q augments the
                  instruction itself at every repetition. Hence
                  successively <200>, <201> etc. are added
                  to A. At the end the sum is left in A and
                  the programme proceeds at 104.

104  etc.

It is left to the reader to work out the action diagram.

This example is not programmed for minimum waiting, but by
supplying the repeating instruction X7990K3W with a Q it will
step up the repeated instruction A200Q by 2 every time. Now,
once the first instruction has been located, all even locations
following are emerging from the drum just at the right time. The odd
numbered locations must be summed in a second, similar
repetition.

References 

VandW59; VandW52, 56; WilkM53a.


Chapter  13 


UNIVAC Scientific (1103A)
instruction logic¹

John W. Carr III

The UNIVAC Scientific computer is a (35, 0, 0)² binary machine,
with option of (27, 8, 0). The arithmetic unit contains two 36-bit
X (exchange) and Q (quotient) registers and one 72-bit A register
(accumulator). Negative numbers are represented in one's
complement notation.

Input-output is via high-speed paper tape reader and punch,
direct card reader and punch, and Uniservo magnetic tape units,
which may be connected to peripheral punched card readers and
punches and a high-speed printer. In addition, information may
be recorded on magnetic tape directly from keyboards by the use
of Unitypers. Communication with external equipment is via an
8-bit (IOA) register and a 36-bit (IOB) register. Information sent
to these registers controls magnetic tapes as well as other
input-output equipment. The program address counter (PAK) contains
the present instruction address. Storage is in up to 12,288 locations
of magnetic core storage, along with a directly addressable drum
of 16,384 locations. Instructions are of the two-address form,
with six bits for the operation code and two fifteen-bit addresses
(u and v).

The  following  information  is  taken  from  a  Univac  Scientific 
Manual  [Univac  Scientific  Electronic  Computing  System  Model 
1103A,  Form  EL338]. 

Definitions  and  conventions 

Instruction word

oc        u          v
6 bits    15 bits    15 bits

¹In E. M. Grabbe, S. Ramo, and D. E. Wooldridge (eds.), "Handbook of
Automation, Computation, and Control," vol. 2, chap. 2, pp. 77-83, John
Wiley & Sons, Inc., New York, 1959.

²Carr's triplet notation for: fractional significant digits, digits in exponent,
and digits to left of radix point.

oc    Operation code
u     First execution address
v     Second execution address

For some of the instructions, the form jn or jk replaces the u
address; for others the form k replaces the v address.

j    One-digit octal number modifying the instruction
n    Four-digit octal number designating number of times
     instruction is to be performed
k    Seven-digit binary number designating the number of places
     the word is to be shifted to the left

Address allocations (octal)

MC    00000-07777    4096
      00000-17777    8192 or
      00000-27777    12,288 36-bit words
Q     31000-31777    1 36-bit word
A     32000-37777    1 72-bit word
MD    40000-77777    16,384 36-bit words

Fixed  addresses 

F1    00000 or 40001
F2    00001
F3    00002
F4    00003

Arithmetic  section  registers 

A      72-bit  accumulator  with  shifting  properties 

A_R   Right-hand 36 bits of A

A_L   Left-hand 36 bits of A

Q     36-bit register with shifting properties

X     36-bit exchange register

Note: Parentheses denote contents of. For example, (A) means
contents of A (72-bit word in A); (Q) means contents of Q (36-bit
word in Q).




Input-output  registers 

IOA     8-bit in-out register

IOB     36-bit in-out register

TWR    6-bit  typewriter  register 

HPR     7-bit  high-speed  punch  register 

Word  extension 

D(u)  72-bit  word  whose  right-hand  36  bits  are  the  word  at 
address  u,  and  whose  left-hand  36  bits  are  the  same  as 
the  leftmost  bit  of  the  word  at  u. 

S(u)  72-bit  word  whose  right-hand  36  bits  are  the  word  at 
address  u,  and  whose  left-hand  36  bits  are  zero. 

D(Q)  72-bit  word — right-hand  36  bits  are  in  register  Q,  left- 
hand  36  bits  are  same  as  leftmost  bit  in  register  Q. 

S(Q)    same  as  D(Q)  except  left  36  bits  are  zero. 

D(A_R), S(A_R)    are similarly defined.

L(Q)(u)  72-bit word: left-hand 36 bits are zero, right-hand
36 bits are the bit-by-bit product of corresponding
bits of (Q) and word at address u.

L(Q')(v)  72-bit word: left-hand 36 bits are zero, right-hand
36 bits are the bit-by-bit product of corresponding
bits of the complement of (Q) and word at
address v.
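The word extensions defined above can be illustrated with Python integers standing in for 36-bit and 72-bit words. This is a sketch of the definitions only; the helper names `D`, `S`, `L` and `L_comp` mirror the notation of the text but are otherwise hypothetical, and words are treated as raw bit patterns without further one's complement arithmetic.

```python
MASK36 = (1 << 36) - 1
MASK72 = (1 << 72) - 1

def D(word):
    """72-bit word whose left-hand 36 bits all copy the leftmost bit of `word`."""
    sign = (word >> 35) & 1
    left = (MASK36 << 36) if sign else 0
    return left | (word & MASK36)

def S(word):
    """72-bit word whose left-hand 36 bits are zero."""
    return word & MASK36

def L(q, word):
    """Bit-by-bit product of (Q) and the word; left-hand 36 bits are zero."""
    return q & word & MASK36

def L_comp(q, word):
    """Bit-by-bit product of the complement of (Q) and the word."""
    return (~q & MASK36) & word

# An all-ones 36-bit word extends to an all-ones 72-bit word.
print(hex(D(MASK36)), hex(S(MASK36)))
```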

Transmit  instructions 

11    Transmit Positive TPuv: Replace (v) with (u).

13    Transmit Negative TNuv: Replace (v) with the
complement of (u).

12  Transmit Magnitude TMuv: Replace (v) with the absolute
magnitude of (u).

15  Transmit U-address TUuv: Replace the 15 bits of (v)
designated by v_15 through v_29, with the corresponding bits of
(u), leaving the remaining 21 bits of (v) undisturbed.

16  Transmit V-address TVuv: Replace the right-hand 15 bits
of (v) designated by v_0 through v_14, with the corresponding
bits of (u), leaving the remaining 21 bits of (v) undisturbed.

35  Add and Transmit ATuv: Add D(u) to (A). Then replace
(v) with (A_R).

36  Subtract and Transmit STuv: Subtract D(u) from (A). Then
replace (v) with (A_R).

22  Left Transmit LTjkv: Left circular shift (A) by k places.
If j = 0 replace (v) with (A_L); if j = 1 replace (v) with (A_R).

Note: In each entry the leading number is the operation code in
octal notation, and the letter group that follows the name (for
example TPuv) is the mnemonic notation.


Q-controlled  instructions 

51  Q-controlled Transmit QTuv: Form in A the number
L(Q)(u). Then replace (v) by (A_R).

52  Q-controlled Add QAuv: Add to (A) the number L(Q)(u).
Then replace (v) by (A_R).

53  Q-controlled Substitute QSuv: Form in A the quantity
L(Q)(u) plus L(Q')(v). Then replace (v) with (A_R). The
effect is to replace selected bits of (v) with the
corresponding bits of (u) in those places corresponding to 1's
in Q. The final (v) is the same as the final (A_R).
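The effect of Q-controlled Substitute is a masked bit merge, which can be sketched in Python. Because L(Q)(u) and L(Q')(v) never have a one in the same bit position, their sum equals their bitwise OR, and that is what the sketch uses. The function name `q_substitute` is hypothetical, and the words are plain 36-bit patterns.

```python
MASK36 = (1 << 36) - 1

def q_substitute(q, u_word, v_word):
    """Return the final (v): bits of u where Q has ones, bits of v elsewhere."""
    from_u = q & u_word               # L(Q)(u)
    from_v = (~q & MASK36) & v_word   # L(Q')(v)
    return from_u | from_v            # disjoint bits, so OR equals the sum

result = q_substitute(0b11110000, 0b10101010, 0b01010101)
print(bin(result))
```

With Q holding ones in the upper four of the eight bits shown, the upper four bits of the result come from u and the lower four from v.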

Replace  instructions 

21    Replace Add RAuv: Form in A the sum of D(u) and D(v).
Then replace (u) with (A_R).

23    Replace Subtract RSuv: Form in A the difference D(u)
minus D(v). Then replace (u) with (A_R).

27    Controlled Complement CCuv: Replace (A_R) with (u)
leaving (A_L) undisturbed. Then complement those bits of
(A_R) that correspond to ones in (v). Then replace (u) with
(A_R).

54  Left Shift in A LAuk: Replace (A) with D(u). Then left
circular shift (A) by k places. Then replace (u) with (A_R).
If u = A, the first step is omitted, so that the initial content
of A is shifted.

55  Left Shift in Q LQuk: Replace (Q) with (u). Then left
circular shift (Q) by k places. Then replace (u) with (Q).

Split  instructions 

31  Split Positive Entry SPuk: Form S(u) in A. Then left
circular shift (A) by k places.

33  Split Negative Entry SNuk: Form in A the complement
of S(u). Then left circular shift (A) by k places.

32  Split Add SAuk: Add S(u) to (A). Then left circular shift
(A) by k places.

34  Split Subtract SSuk: Subtract S(u) from (A). Then left
circular shift (A) by k places.

Two-way  conditional  jump  instructions 

46  Sign Jump SJuv: If A_71 = 1, take (u) as NI. If A_71 = 0,
take (v) as NI. (NI means next instruction.)

47  Zero Jump ZJuv: If (A) is not zero, take (u) as NI. If (A)
is zero, take (v) as NI.

44  Q-Jump QJuv: If Q_35 = 1, take (u) as NI. If Q_35 = 0, take
(v) as NI. Then, in either case, left circular shift (Q) by
one place.

One-way  conditional  jump  instructions 

41  Index Jump IJuv: Form in A the difference D(u) minus
1. Then if A_71 = 1, continue the present sequence of
instructions; if A_71 = 0, replace (u) with (A_R) and take (v)
as NI.

42  Threshold Jump TJuv: If D(u) is greater than (A), take (v)
as NI; if not, continue the present sequence. In either case,
leave (A) in its initial state.

43  Equality Jump EJuv: If D(u) equals (A), take (v) as NI;
if not, continue the present sequence. In either case leave
(A) in its initial state.

One-way  unconditional  jump  instructions 

45  Manually Selective Jump MJjv: If the number j is zero,
take (v) as NI. If j is 1, 2, or 3, and the correspondingly
numbered MJ selecting switch is set to "jump," take (v)
as NI; if this switch is not set to "jump," continue the
present sequence.

37  Return Jump RJuv: Let y represent the address from
which CI was obtained. Replace the right-hand 15 bits of
(u) with the quantity y plus 1. Then take (v) as NI.

14  Interpret IP: Let y represent the address from which CI
was obtained. Replace the right-hand 15 bits of (F1) with
the quantity y + 1. Then take (F2) as NI.

Stop  instructions 

56  Manually Selective Stop MSjv: If j = 0, stop computer
operation and provide suitable indication. If j = 1, 2, or
3 and the correspondingly numbered MS selecting switch
is set to "stop," stop computer operation and provide
suitable indication. Whether or not a stop occurs, (v) is
NI.

57  Program Stop PS: Stop computer operations and provide
suitable indication.


External equipment instructions

17  External Function EF-v: Select a unit of external
equipment and perform the function designated by (v).

76  External Read ERjv: If j = 0, replace the right-hand 8 bits
of (v) with (IOA); if j = 1, replace (v) with (IOB).

77  External Write EWjv: If j = 0, replace (IOA) with the
right-hand 8 bits of (v); if j = 1, replace (IOB) with (v).
Cause the previously selected unit to respond to the
information in IOA or IOB.

61  PRint PR-v: Replace (TWR) with the right-hand 6 bits of
(v). Cause the typewriter to print the character
corresponding to the 6-bit code.

63  PUnch PUjv: Replace (HPR) with the right-hand 6 bits
of (v). Cause the punch to respond to (HPR). If j = 0, omit
seventh level hole; if j = 1, include seventh level hole.

Arithmetic instructions

71  Multiply MPuv: Form in A the 72-bit product of (u) and
(v), leaving in Q the multiplier (u).

72  Multiply Add MAuv: Add to (A) the 72-bit product of (u)
and (v), leaving in Q the multiplier (u).

73  Divide DVuv: Divide the 72-bit number (A) by (u), putting
the quotient in Q, and leaving in A a non-negative
remainder R. Then replace (v) by (Q). The quotient and
remainder are defined by: (A)_i = (u) · (Q) + R, where
0 ≤ R < |(u)|. Here (A)_i denotes the initial contents
of A.

74  Scale Factor SFuv: Replace (A) with D(u). Then left
circular shift by 36 places. Then continue to shift (A) until
A_34 ≠ A_35. Then replace the right-hand 15 bits of (v) with
the number of left circular shifts, k, which would be
necessary to return (A) to its original position. If (A) is all ones
or zeros, k = 37. If u is A, (A) is left unchanged in the
first step, instead of being replaced by D(A_R).

Sequenced instructions

75  RePeat RPjnw: This instruction calls for the next
instruction, which will be called NIuv, to be executed n times,
its u and v addresses being modified or not according to
the value of j. Afterwards the program is continued by the
execution of the instruction stored at a fixed address F1.
The exact steps carried out are:

a    Replace the right-hand 15 bits of (F1) with the
address w.

b    Execute NIuv, the next instruction in the program,
n times.

c    If j = 0, do not change u and v.
If j = 1, add one to v after each execution.
If j = 2, add one to u after each execution.
If j = 3, add one to u and v after each execution.

The modification of the u address and v address is done
in program control registers. The original form of the
instruction in storage is unaltered.

d    On completing n executions, take (F1) as the next
instruction. F1 normally contains a manually
selective jump whereby the computer is sent to w for
the next instruction after the repeat.

e    If the repeated instruction is a jump instruction,
the occurrence of a jump terminates the repetition.
If the instruction is a Threshold Jump or an Equality
Jump, and the jump to address v occurs, (Q) is
replaced by the quantity j, (n - r), where r is the
number of executions that have taken place.
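Steps b, c and e of the Repeat instruction can be sketched as a loop in Python. This is an illustrative model, not the machine's control logic: the names `repeat` and `transmit_positive`, the dictionary used as storage, and the wrap-around of addresses at 15 bits are all assumptions made for the example. The handling of F1 and of (Q) after a jump is left out.

```python
ADDRESS_MASK = 0x7FFF  # 15-bit addresses (wrap-around is an assumption)

def repeat(j, n, u, v, execute):
    """Execute `execute(u, v)` up to n times, stepping u and v as selected by j.

    `execute` returns True when the repeated instruction jumps, which
    terminates the repetition (step e).
    """
    executions = 0
    for _ in range(n):
        jumped = execute(u, v)
        executions += 1
        if jumped:
            break
        if j & 2:                      # j = 2 or 3: add one to u
            u = (u + 1) & ADDRESS_MASK
        if j & 1:                      # j = 1 or 3: add one to v
            v = (v + 1) & ADDRESS_MASK
    return executions, u, v

# A three-word block transfer: Transmit Positive repeated with j = 3.
memory = {0o100: 7, 0o101: 8, 0o102: 9}

def transmit_positive(u, v):
    memory[v] = memory[u]   # replace (v) with (u)
    return False            # a transmit never jumps

result = repeat(3, 3, 0o100, 0o200, transmit_positive)
```

As in the text, the stepped addresses live only in the loop variables; nothing in `memory` that represents the stored instruction is altered.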


Floating  point  instructions 

64  Add FAuv: Form in Q the normalized rounded packed
floating point sum (u) + (v).

65  Subtract FSuv: Form in Q the normalized rounded packed
floating point difference (u) - (v).

66  Multiply FMuv: Form in Q the normalized rounded
packed floating point product (u) · (v).

67  Divide FDuv: Form in Q the normalized rounded packed
floating point quotient (u) ÷ (v).

01  Polynomial Multiply FPuv: Floating add (v) to the floating
product (Q)_i · (u), leaving the packed normalized rounded
result in Q.

02  Inner Product FIuv: Floating add to (Q)_i the floating
product (u) · (v) and store the rounded normalized packed
result in Q. This instruction uses MC location F4 = 00003
for temporary storage, where (F4)_f = (Q)_i. The subscripts
i and f represent "initial" and "final."

03  Unpack UPuv: Unpack (u), replacing (u) with (u)_M and
replacing (v)_C with (u)_C or its complement if (u) is negative.
The characteristic portion of (u)_f contains sign bits. The
sign portion and mantissa portion of (v)_f are set to zero.
Note. The subscripts M and C denote the mantissa and
characteristic portions.

04  Normalize Pack NPuv: Replace (u) with the normalized
rounded packed floating point number obtained from the
possibly unnormalized mantissa in (u)_i and the biased
characteristic  in  (v)^,.  Note.  It  is  assumed  that  (u)j  has  the 
binary  point  between  and  u.,^;  that  is,  that  (u)j  is  scaled 
by  2  ". 

05  Normalize  Exit  NEj-:  If  j  =  1  normalize  without  rounding 
until  a  master  clear  or  until  the  instruction  is  again  exe- 
cuted with  j  =  0. 

References 

Univac Scientific Electronic Computing System Model 1103A, Form
EL338


Chapter  14 

Instruction logic of the MIDAC¹


John W. Carr III


The MIDAC, Michigan Digital Automatic Computer [Carr, 1956],
was constructed on the basis of the design of the SEAC at the
National Bureau of Standards. Its instruction code is particularly
of interest because it incorporates the index register concept into
a three-address binary instruction. Numbers in this machine are
(44, 0, 0)² fixed points. The word length is 45 binary digits with
serial operation.

Word  structure 

The data or address positions of an instruction are labeled the α,
β, and γ positions. Each contains twelve binary digits represented
externally as three hexadecimal digits. Four binary digits, or one
hexadecimal digit, are used to convey the instruction modification
or relative addressing information. The next four binary digits or
single hexadecimal digit represents the operation portion of the
instruction. The final binary digit is the halt or breakpoint
indicator for use with the instruction.

For example, the 45-binary-digit word

000001100100000011001000000100101100000001011

considered as an instruction would be interpreted as

α               β               γ               abcd    Op      halt

000001100100    000011001000    000100101100    0000    0101    1

In external hexadecimal form this would be written

064    0c8    12c    0    5  -

The above binary word is the equivalent machine representation
of the following instruction: "Take the contents of hexadecimal
address 064, add to it the contents of hexadecimal address 0c8,
and store the result in hexadecimal address 12c. There is no
modification of the 12-binary-digit address locations given by the
instruction. Upon completion of the operation, stop the machine
if the proper external switches are energized." The binary
combination represented by 5 is the operation code for addition.

¹In E. M. Grabbe, S. Ramo, and D. E. Wooldridge (eds.), "Handbook of
Automation, Computation, and Control," vol. 2, chap. 2, pp. 115-121,
John Wiley & Sons, Inc., New York, 1959.

²Carr's triplet notation for: fractional significant digits, digits in exponent,
and digits to left of radix point.
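The field split of the example word can be reproduced with a few lines of Python. The sketch assumes only what the text states: three 12-digit positions, four modification digits, four operation digits and one halt digit, read left to right. The function names `decode_midac` and `external_form` are hypothetical.

```python
def decode_midac(bits):
    """Split a 45-binary-digit instruction word into its fields."""
    if len(bits) != 45 or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of 45 binary digits")
    return {
        "alpha": int(bits[0:12], 2),   # first address position
        "beta": int(bits[12:24], 2),   # second address position
        "gamma": int(bits[24:36], 2),  # third address position
        "abcd": int(bits[36:40], 2),   # instruction modification digits
        "op": int(bits[40:44], 2),     # operation code
        "halt": int(bits[44], 2),      # halt or breakpoint indicator
    }

def external_form(fields):
    """Format the address, modification and operation fields in hexadecimal."""
    return "{alpha:03x} {beta:03x} {gamma:03x} {abcd:x} {op:x}".format(**fields)

word = "000001100100" "000011001000" "000100101100" "0000" "0101" "1"
fields = decode_midac(word)
print(external_form(fields), "halt" if fields["halt"] else "")
```

For the example word this yields the addresses 064, 0c8 and 12c, modification digits 0 and operation code 5, with the halt digit set.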

Data  or  addresses 

The addresses given by the twelve binary digits in each of the
three locations designate in the machine the individual acoustic
storage cells and blocks of eight magnetic drum storage cells. The
addresses from 0 to 1023 (decimal) or 000 to 3FF (hexadecimal)
correspond to acoustic storage cells. The addresses from 1024 to
4095 (decimal) or 400 to FFF (hexadecimal) correspond to
magnetic drum storage blocks. In certain operations, however, the
addresses 0 to 15 (decimal) or 0 to F (hexadecimal) represent
input-output stations rather than storage locations.

These twelve-binary-digit groups will in some cases be modified
by the machine in order to yield a final twelve-binary-digit address.
The method of processing will depend on the values of the
instruction modification digits. After modification, the final result will
then be interpreted by the control unit as a machine address.

In some instructions, namely those that perform change of
control operations, which involve cycling and counting rather than
simple arithmetic operations on numbers, the α and β positions
in an instruction are not considered as addresses. In those cases,
they are used instead as counters or tallies. In other instructions,
which do not require three addresses, but only one or two, the
β position is not considered as an address. In these cases, the
oddness or evenness of the β address is used to differentiate
between two operations having the same operation code digits. That
is, the parity of binary digit P22 is used as an extra function
designator.

Instruction  modification  digits 

The four binary digits P9-P6 are used as instruction modification
or relative addressing digits. Their normal function is relatively
simple; nevertheless, the possible exceptions to the general rule
can make their behavior complicated. These four digits are labeled


210  Part  3  |  The  instruction-set  processor  level:  variations  in  the  processor 


Section  1  |  Processors  with  greater  than  1  address  per  instruction 


the a, b, c, and d digits. Ordinarily the a digit is associated with
the α position, the b digit with the β position, and the c digit
with the γ position in an instruction.

When binary digit P22 (or the β position) is used in an instruction
to represent extra operation information, the instruction
modification digit b is ignored. In the case of input and output
instructions, when the various address positions represent machine
address locations on the drum, input-output stations, or block
lengths, and modification of these addresses is not desired in any
case, the corresponding relative addressing digits are ignored.

The  purpose  of  the  instruction  modification  digits  is  to  tell  the 
machine  whether  or  not  to  modify  the  twelve  binary  digits  making 
up  the  corresponding  address  position  in  an  instruction  by  addition 
of  the  contents  of  one  or  the  other  of  two  counters.  In  the  normal 
case,  if  the  a,  b,  or  c  digit  is  a  zero,  the  twelve  binary  digits  in 
the  corresponding  position  are  interpreted,  unchanged,  as  the 
binary  representation  of  the  machine  address  of  the  number  word 
to  be  processed  by  the  instruction. 

If  one  or  more  of  the  a,  b,  or  c  digits  is  a  one,  the  contents 
of  one  of  two  auxiliary  address  counters  is  added  to  the  corre- 
sponding twelve  binary  digits  to  yield  a  final  address  usually  differ- 
ent from  that  given  by  the  original  twelve-digit  portion  of  the 
instruction  word.  The  addresses  are  then  said  to  be  relative  to  the 
counter. 

The two counters involved in the address modification feature
of the MIDAC are known as the instruction counter and the base
counter. In the normal case, if the fourth instruction modification
or d digit is a zero, the contents of the instruction counter will
be added to the contents of the various twelve-digit addresses
(dependent on the values of the a, b, and c digits) before further
processing of the instruction. If the a digit is one and the d digit
zero, the contents of the instruction counter will be added to the
α address; similarly for the b and d digits and the β address, etc.

If the d digit is a one, the contents of the base counter will
be normally added to the contents of the twelve digits in the α,
β, and γ positions (again dependent on the values of the a, b, and
c digits), before further processing of the results. If the a digit is
one and the d digit one, the contents of the base counter will be
added to the α address, etc.

The effect of the instruction modification digits may be summarized
as follows:

The contents of the two counters will be designated by $C_d$ ($d = 0, 1$):

$$C_0 = \text{contents of the instruction counter}$$
$$C_1 = \text{contents of the base counter}$$

Then the modified addresses α', β', and γ' are related to the α,
β, and γ addresses appearing in the instruction by the following:

$$\alpha' = \alpha + aC_d \qquad \beta' = \beta + bC_d \qquad \gamma' = \gamma + cC_d \qquad (a, b, c, d = 0, 1)$$

In certain instructions addresses relative to one of the two
counters may be prohibited. Thus, if in a particular instruction
α may be relative only to the instruction counter, then for that
instruction

$$\alpha' = \alpha + aC_0$$

no matter whether the d digit is a 0 or a 1.

The notation (α'), (β'), or (γ') is used to indicate the word stored
in the location whose address is α', β', or γ'.
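
The modification rule can be illustrated with a minimal sketch. The function and parameter names are hypothetical, and reducing the sum modulo 4096 is an assumption suggested by the twelve-binary-digit address length; the text does not state how overflow is handled.

```python
MODULUS = 4096  # twelve-binary-digit addresses; wraparound is an assumption

def modify_addresses(alpha, beta, gamma, a, b, c, d,
                     instruction_counter, base_counter):
    """Return (alpha', beta', gamma') for modification digits a, b, c, d.

    d selects the counter: 0 for the instruction counter, 1 for the base
    counter. Each of a, b, c (0 or 1) decides whether that counter is
    added to the corresponding address.
    """
    counter = base_counter if d else instruction_counter
    return (
        (alpha + a * counter) % MODULUS,
        (beta + b * counter) % MODULUS,
        (gamma + c * counter) % MODULUS,
    )
```

With a = 1, b = 0, c = 1 and d = 0, only the α and γ addresses become relative to the instruction counter; setting d = 1 makes the same two addresses relative to the base counter instead.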

Instruction  counter 

The  instruction  counter  is  a  twelve-binary  digit  (modulo  4096) 
counter  which  contains  the  binary  representation  of  the  address 
of  the  instruction  which  the  control  unit  is  processing  or  is  about 
to  process.  In  normal  operation  when  no  change  of  control  opera- 
tion is  being  processed,  the  contents  of  the  instruction  counter 
is  increased  by  one  at  the  completion  of  each  instruction.  Thus, 
normally  the  next  instruction  to  be  processed  is  stored  in  the 
acoustic  storage  cell  immediately  following  the  cell  which  contains 
the  present  instruction. 

A  change  of  control  operation  is  one  which  selects  a  next  in- 
struction not  stored  in  sequence  in  the  acoustic  storage.  That  is, 
at  the  completion  of  such  instructions  the  contents  of  the  instruc- 
tion counter  is  not  increased  by  one,  but  instead  is  changed  en- 
tirely. 

Base  counter 

The  base  counter  is  a  second  twelve-binary-digit  counter  (modulo 
4096),  physically  identical  to  the  instruction  counter,  which  con- 
tains the  binary  representation  of  a  base  number  or  tally.  Unlike 
the  instruction  counter,  however,  the  base  counter  does  not  se- 
quence automatically,  but  remains  unchanged  until  a  change  of 
base instruction is processed. This counter serves two primary
purposes,  dependent  on  the  usage  to  which  it  is  put: 

1  It  may  contain  the  address  of  the  initial  word  in  a  group, 
thus  serving  as  a  base  address  to  which  integers  representing 
the  relative  position  of  a  given  word  in  the  group  of  words 
may  be  added  by  using  the  address  modification  digits. 


Chapter  14  [  Instruction  logic  of  the  MIDAC  211 


2 It may contain a counter or tally which can be increased
by a base instruction. This instruction makes use of the
address modification digits to change the counter so as to
count the number of traversals of a particular cycle of
instructions.

Instruction  types 

Instructions used in MIDAC can be divided into three categories:
change of information, change of control, and transfer of information.
The first category can be further subdivided into arithmetic
and logical instructions. The arithmetic instructions include
addition, subtraction, division, various forms of multiplication,
power extraction, number shifting, and number conversion.
The sole logical instruction is extract, which modifies information
in a nonarithmetic fashion.

The  transfer  of  information  or  data  transfer  instructions  include 
transfers  of  individual  words  or  blocks  of  words  into  and  out  of 
the  acoustic  storage  and  drum  and  magnetic  tape  control. 

The possible change of control instructions include two comparisons
that provide different future sequences dependent on the
differences of two numbers. In the compare numbers or algebraic
comparison, the difference is an algebraic, signed one. In the
compare magnitudes or absolute comparison, the difference is one
between absolute values. Two other instructions, file and base,
perform other tasks besides transferring control. The file instruction
transfers control unconditionally; it also files or stores
the contents of the base or instruction counter in a specific address
position of a particular word in the storage. The base or tally
instruction provides a method for referring addresses automatically
relative to the address given by the base counter, irrespective of
its contents. The base instruction also gives a conditional transfer
of control.

The nineteen MIDAC instructions can be described functionally
as follows:

Change  of  information 

1 Add. (α') + (β') is placed in γ'. Result must be less than
1 in absolute value.

2 Subtract. (α') − (β') is placed in γ'. Result must be less
than 1 in absolute value.

3 Multiply, Low Order. The least significant 44 binary digits
of (α') × (β') are placed in γ'.

4 Multiply, High Order. The most significant 44 binary digits
of (α') × (β') are placed in γ'.


5 Multiply, Rounded. The most significant 44 binary digits
of (α') × (β') ± $1 \cdot 2^{-45}$ are placed in γ'. The $1 \cdot 2^{-45}$ is
added if (α') × (β') is positive, and subtracted if (α') × (β')
is negative.

6 Divide. The most significant 44 binary digits of (β')/(α')
are placed in γ'. (Note the inversion of order of α and β.)
Result must be less than 1 in absolute value.

7 Power Extract. The number $n \cdot 2^{-44}$ is placed in γ' where
n is the number of binary 0's to the left of the most significant
binary 1 in (α'). The b digit is ignored; β may be any
even number. If (α') is all zeros, zero is placed in γ'.

8 Shift Number. The 44 binary digits immediately to the
right of the radix point in $(\alpha') \cdot 2^{n}$ are placed in γ'.
The result, in γ', is the equivalent of shifting (α') n places,
where $n \cdot 2^{-44} = (\beta')$ and n positive indicates a shift left,
n negative a shift right. If |n| ≥ 44, zero is placed in γ'.

9 Extract or Logical Transfer. Those binary digits in (γ'),
including the sign digit, whose positions correspond to 1's
in (β') are replaced by the digits in the corresponding
positions of (α').

10 Decimal to Binary Conversion. This operation may be
interpreted in two ways: (a) (α') is considered as a binary-coded-decimal
integer times $2^{-44}$. It is converted to the
equivalent binary integer times $2^{-44}$ and the result is
placed in γ', or (b) (α') is considered as a binary-coded-decimal
fraction, D. It is converted into an intermediate
binary fraction, $B_1$, such that $B_1 = D \times 10^{11} \times 2^{-44}$ and
the result placed in γ'. To obtain B, the true binary equivalent
of D, $B_1$ must be multiplied by $(10^{-11} \times 2^{44})$. However,
since this factor is greater than 1 and therefore cannot
be represented in the machine, two operations must
be performed. For example,

$$B_1 \times (10^{-11} \times 2^{44} - 1) = B_2$$
$$B = B_1 + B_2$$

Here the b digit is ignored, and β may be any even number.

11 Binary-to-Decimal Conversion. (α'), considered as a binary
fraction, is converted into the equivalent eleven-digit binary-coded-decimal
fraction. The result is placed in γ'. The
b digit is ignored, and β may be any odd number.
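
Two of the operations above, Extract and Power Extract, lend themselves to a short sketch on integers. The 45-digit word layout (sign digit plus 44 binary digits) and all names are assumptions made for illustration; the result of power_extract is the plain count n, not the scaled machine word.

```python
WORD_BITS = 45          # sign digit plus 44 binary digits (assumed layout)
MAGNITUDE_BITS = 44
WORD_MASK = (1 << WORD_BITS) - 1

def extract(alpha_word: int, beta_word: int, gamma_word: int) -> int:
    """Return gamma_word with the digits selected by 1's in beta_word
    replaced by the corresponding digits of alpha_word."""
    mask = beta_word & WORD_MASK
    return ((gamma_word & ~mask) | (alpha_word & mask)) & WORD_MASK

def power_extract(magnitude: int) -> int:
    """Count the binary 0's to the left of the most significant 1
    in a 44-digit magnitude; an all-zero operand yields zero."""
    if not 0 <= magnitude < (1 << MAGNITUDE_BITS):
        raise ValueError("magnitude must fit in 44 binary digits")
    if magnitude == 0:
        return 0
    return MAGNITUDE_BITS - magnitude.bit_length()
```

Extract behaves as a masked merge: positions where the mask holds a 0 keep the old contents of the destination word.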

Change  of  control 

12 Compare Numbers. γ can be relative only to the instruction
counter. If (α') ≥ (β'), the contents of the instruction
counter are increased by one as is normally done at the
end of each instruction. If (α') < (β'), the contents of the
instruction counter are set to γ'.



13 Compare Magnitudes. γ can be relative only to the instruction
counter. If |(α')| ≥ |(β')|, the contents of the instruction
counter are increased by one as is normally done at
the end of each instruction. If |(α')| < |(β')|, the contents
of the instruction counter are set to γ'.

14 Base or Tally. The d digit is ignored. α and β may be
relative only to the base counter, γ only to the instruction
counter. If α' ≥ β', the contents of the base counter are
set to zero and the contents of the instruction counter
increased by one as usual. If α' < β', the contents of the
base counter are set to α' and the contents of the instruction
counter to γ'. (Note. The comparisons made here are
of addresses themselves, not their contents.)

15 File. β may be any odd number. α and γ may be relative
only to the instruction counter.

If d = 0, the contents of the instruction counter increased
by one is placed in the γ position of (α'), and the
instruction counter is set to γ'.

If d = 1, the contents of the base counter is placed in
the α position of (α'), and the instruction counter is set
to γ'. In addition, if b = 1, the contents of the base counter
is set to zero; if b = 0, the contents of the base counter
is not changed.
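
The counter updates performed by Compare Numbers and Base or Tally can be sketched as pure functions that return the new counter values. The names are hypothetical, and the arguments are assumed to be already-modified addresses (or, for the comparison, the words fetched from them).

```python
MODULUS = 4096  # both counters are twelve-binary-digit, modulo 4096

def compare_numbers(ic, alpha_word, beta_word, gamma_addr):
    """Return the next instruction-counter value for Compare Numbers."""
    if alpha_word >= beta_word:
        return (ic + 1) % MODULUS      # normal sequencing
    return gamma_addr % MODULUS        # transfer of control

def base_or_tally(ic, alpha_addr, beta_addr, gamma_addr):
    """Return (next instruction counter, next base counter) for Base or Tally.

    alpha_addr and beta_addr are compared as addresses themselves,
    not as the words stored at those addresses.
    """
    if alpha_addr >= beta_addr:
        return (ic + 1) % MODULUS, 0
    return gamma_addr % MODULUS, alpha_addr % MODULUS
```

Used as a loop tally, an α of 1 made relative to the base counter yields α' = base + 1 on each traversal: the base counter steps upward and control returns to γ' until α' reaches β', at which point the counter is cleared and the program falls through.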

Transfer  of  information 

16 Read In. The a digit must be 0; the b digit is ignored.
If β is in the range 0 to 7 (decimal) or 000 to 007 (hexadecimal),
α words are read into the acoustic storage from input-output
station β. The first word read in is placed in
γ', the second in γ' + 1, etc. If β is in the range 1024 to
1791 decimal (400 to 6FF hexadecimal), α words are read
into the acoustic storage from the drum starting with the
first word in the drum block whose address is β. The first
word is placed in γ', the second in γ' + 1, etc.

17 Read Out. The a digit must be 0; the c digit is ignored.
Starting with (β'), read out α consecutive words from the
acoustic storage to input-output station γ, if γ is in the
range 0 to 7 decimal (000 to 007 hexadecimal), or to the
drum starting at the beginning of the drum block whose
address is γ, if γ is in the range 1024 to 1791 decimal (400
to 6FF hexadecimal).


16 Alphanumeric Read In. The a digit must be 1; the b digit
is ignored. If β is in the range 0 to 7 (decimal) or 000 to
007 (hexadecimal), α characters are read into the acoustic
storage from input-output station β. The first character
read in is placed in γ', the second in γ' + 1, etc. Each
character occupies the six most significant digit positions
of the register into which it is read; the other positions
are set to zero. This operation may not be used to read
words from the drum into the acoustic storage.

17 Alphanumeric Read Out. The a digit must be 1; the c digit
is ignored. Starting with (β'), read out α consecutive characters
from the acoustic storage to input-output station
γ; γ must be in the range 0 to 7 (decimal) or 000 to 007
(hexadecimal). This operation may not be used to read
words from the acoustic storage onto the drum.

18 Move Tape Forward. (a, b, c, and d digits are ignored.) β
may be any even number; γ must be in the range 0 to 15
decimal (000 to 00F hexadecimal). The magnetic tape at
input-output station γ is moved forward n blocks where

$$n = 1 + \left[\frac{\alpha - 1}{8}\right]$$

that is, one plus the integral part of (α − 1)/8, or the number
of blocks that include α words.

19 Move Tape Backward. (a, b, c, and d digits are ignored.)
β may be any odd number; γ must be in the range 0 to
15 decimal (000 to 00F hexadecimal). The magnetic tape
at input-output station γ is moved backward n blocks
where

$$n = 1 + \left[\frac{\alpha - 1}{8}\right]$$

that is, one plus the integral part of (α − 1)/8, or the number
of blocks that include α words.
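
The block count used by the two tape instructions, one plus the integral part of (α − 1)/8, can be checked with a few lines. The function name is hypothetical, and rejecting a word count below 1 is an assumption, since the text does not say what a count of zero does.

```python
WORDS_PER_BLOCK = 8  # the divisor in the block-count rule

def tape_blocks(alpha: int) -> int:
    """Number of tape blocks that include alpha words."""
    if alpha < 1:
        raise ValueError("this sketch assumes at least one word")
    return 1 + (alpha - 1) // WORDS_PER_BLOCK
```

Eight words still fit in one block, while nine words require two.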

References 

CarrJ56. SEAC computer references: AinsE52; AlexS51; ElboR53; GreeS52,
53; HaueR52; PikeJ52; SerrR62; ShupP53; SlutR51. DYSEAC computer
references: LeinA54.


Chapter  15 


Instruction logic of the
Soviet Strela (Arrow)¹

John W. Carr III

A typical general purpose digital computer using three-address
instruction logic is the Strela (Arrow), constructed in quantity
under the leadership of Iu. Ia. Basilewskii of the Soviet Academy
of Sciences, and described in detail by Kitov [1956]. This computer
uses a (35, 6, 0)² binary floating point number system.
Its instruction word, of 43 digits, contains a six-digit operation
code and three 12-digit addresses, with one breakpoint bit. In
octal notation, two digits represent the operation, four each the
addresses, and one bit the breakpoint. This machine operates with
up to 2048 words of high-speed cathode ray tube storage.
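
As a rough illustration of the 43-digit instruction word, the sketch below unpacks the five fields from an integer. The field widths come from the text; the field order (operation code in the most significant digits, then the three addresses, then the breakpoint bit) is an assumption, as are all names.

```python
WORD_BITS = 43  # 6-digit operation code + three 12-digit addresses + 1 bit

def decode_strela_word(word: int) -> dict:
    """Split a 43-digit instruction word into its fields.

    Assumed layout, most significant first:
    operation (6), alpha (12), beta (12), gamma (12), breakpoint (1).
    """
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word must fit in 43 binary digits")
    return {
        "operation": (word >> 37) & 0o77,
        "alpha": (word >> 25) & 0o7777,
        "beta": (word >> 13) & 0o7777,
        "gamma": (word >> 1) & 0o7777,
        "breakpoint": word & 1,
    }
```

Under this assumed layout, each field prints naturally in octal: two digits for the operation and four for each address.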

Input-output is ordinarily via punched cards and punched
paper tape. A "standard program library" is attached to the computer
as well as magnetic tape units (termed "external accumulators"
below). Note. This computer is different from both the BESM
described by Lebedev [1956] and the Ural reported by Basilewskii
[1957]. Apparently, it is somewhat lower in performance than
BESM.

Since all arithmetic is ordinarily in floating point, "special
instructions" perform fixed point computations for instruction
modifications.

Ordinarily instructions are written in an octal notation, but
external to the machine, operation symbols are written in a
mnemonic code. The two-digit numerals are the octal instruction
equivalent.

Arithmetic  and  logical  instructions 

01. + α β γ. Algebraic addition of (α) to (β) with result
in γ.

02. +₁ α β γ. Special addition, used for increasing addresses
of instructions. The command (α) or (β) is added to the
number (β) or (α) and the result sent to the cell with address γ.
As a rule, the address of the instruction being changed corresponds
to the address γ.

¹In E. M. Grabbe, S. Ramo, and D. E. Wooldridge (eds.), "Handbook of
Automation, Computation, and Control," vol. 2, chap. 2, pp. 111-115,
John Wiley & Sons, Inc., New York, 1959.

²Carr's triplet notation for: fractional significant digits, digits in exponent,
and digits to left of radix point.

03. − α β γ. Subtraction with signed numbers. From
the number (α) is subtracted the number (β) and the result sent
to γ.

04. −₁ α β γ. Difference of the absolute values of two
numbers: |(α)| − |(β)| = (γ).

05. × α β γ. Multiplication of two numbers (α) and (β)
with result sent to γ.

06. ∧ α β γ. Logical multiplication of two numbers in
cells α and β. This instruction is used for extraction from a given
number or instruction a part defined by the special number (β).

07. ∨ α β γ. Logical addition of two numbers (α) and
(β) and sending the result to cell γ. This instruction is used for
forming numbers and commands from parts.

10. Sh α β γ. Shift of the contents of cell α by the
number of steps equal to the exponent of (β). If the exponent
of (β) is positive then the shift proceeds to the left, in the
direction of increasing value; if negative, then the shift is right.
In addition, the sign of the number, which is shifted out of the
cell, is lost.

11. −₂ α β γ. Special subtraction, used for decreasing
the addresses of instructions. In the cell α is found the instruction
to be transformed, and in cell β the specially selected number.
Ordinarily addresses α and γ are identical.

12. ^ α β γ. Comparison of two numbers (α) and (β)
by means of digital additions of the numbers being compared
modulo two. In the cell γ is placed a number possessing ones in
those digits in which inequivalence results in the numbers being
compared.
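
The three logical instructions (06, 07, and 12) correspond to the familiar bitwise AND, OR, and exclusive OR. The sketch below shows them on plain integers; the function names and the twelve-digit address mask are illustrative choices, not part of the machine description.

```python
ADDRESS_MASK = 0o7777  # twelve binary digits: one four-octal-digit address

def logical_multiply(word: int, mask: int) -> int:
    """Instruction 06: keep only the digits of word selected by mask."""
    return word & mask

def logical_add(part_a: int, part_b: int) -> int:
    """Instruction 07: form one word from two parts."""
    return part_a | part_b

def compare(word_a: int, word_b: int) -> int:
    """Instruction 12: digit-by-digit addition modulo two; ones mark
    the positions in which the two words differ."""
    return word_a ^ word_b
```

Extracting an address part with logical multiplication and reassembling a word with logical addition are the two uses the text names; a comparison result of zero means the two words are identical.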

Control  instructions 

13. C α β 0000. Conditional transfer of control either to
instruction (α) or to instruction (β), depending on the results of
the preceding operation. With the operations of addition, subtraction,
and subtraction of absolute values, it appraises the sign
of the result: for a positive or zero result it transfers control to
the command (α), for negative results to the command (β).

After the operation of multiplication, the transfer depends on
the relationship of the result to unity. Transfer is made to the
command (α) in the case where the result is greater than or equal
to one, and to command (β) if it is smaller than one.

For conditional transfer after the operation of comparison,
transfer to the instruction (α) is made in the case of equality of
binary digits, and to (β) when there is any inequivalence.

After the operation ∧ (logical sequential multiplication) the
conditional transfer command jumps to the instruction (α) when
the result is different from zero, and to instruction (β) when it
is equal to zero.

A forced comparison is given by

C α α 0000

The third address in this command is not used and in its place
is put zero.

14. I-O α 0000 0000. This instruction is executed in parallel
with the code of the other operations, and guarantees bringing
into working position in good time the zone of the external accumulator
(magnetic tape unit) with the address α.

15. H 0000 0000 0000. This instruction executes an absolute
halt.
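
The rules governing the conditional transfer (instruction 13) depend on which operation preceded it. They can be collected into one small decision function; the operation labels and the function name are hypothetical, chosen only for this sketch.

```python
def conditional_transfer(previous_op: str, result, alpha: int, beta: int) -> int:
    """Return the address of the next instruction for 'C alpha beta 0000'.

    previous_op is an illustrative label for the preceding operation;
    result is the value that operation produced.
    """
    if previous_op in ("add", "subtract", "subtract_abs"):
        return alpha if result >= 0 else beta   # sign of the result
    if previous_op == "multiply":
        return alpha if result >= 1 else beta   # relationship to unity
    if previous_op == "compare":
        return alpha if result == 0 else beta   # all digits equal
    if previous_op == "logical_multiply":
        return alpha if result != 0 else beta   # result different from zero
    raise ValueError("no transfer rule given for " + previous_op)
```

Writing the same address in both positions, as in the forced form C α α 0000, makes the outcome independent of the preceding result.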

Group  transfer  instructions 

Special  instructions  for  group  transfer  serve  for  the  accomplish- 
ment of  a  transfer  of  numbers  to  and  from  the  accumulators.  In 
the  second  address  in  these  instructions  stands  an  integer,  desig- 
nating the  quantity  of  numbers  in  the  group  which  must  be  trans- 
ferred. Group  transfers  always  are  produced  in  increasing  sequence 
of  addresses  of  cells  in  the  storage. 

16. T₁ 0000 n γ. The instruction T₁ guarantees transfer
from a given input unit (with punched cards, perforated tape, etc.)
into the storage. In the third address γ of the instruction is indicated
the initial address of the group of cells in the storage where
numbers are to be written. With punched paper tape or punched
cards the variables are written in sequence, beginning with the
first line.

17. T₂ 0000 n γ. The instruction T₂ guarantees transfer
of a group of n numbers from an input unit into the external
accumulator in zone γ.

20. T₃ α n γ. This instruction guarantees a line-by-line
sequence of transfers of n numbers from zone α of the external
accumulator into the cells of the storage beginning with the cell
with address γ.

21. T₄ α n 0000. This instruction guarantees the transfer
to the input-output unit (to punched paper tape or punched
cards) of a group of n numbers from the storage, beginning with
address α. The record on punched paper tape or punched cards
as a rule will begin with the first line and therefore a positive
indication of the addresses of the record is not required.

22. T₅ α n γ. Instruction T₅ guarantees transfer of a
group of n numbers from one place in the storage with initial
address α into another place in the storage with initial address γ.

23. T₆ α n γ. Instruction T₆ guarantees transfer of a
group of n numbers from the storage with initial address α into
the external accumulator with address γ.

24. T₇ α n 0000. Instruction T₇ serves for transfer of n
numbers from the zone of the external accumulator with address
α into the input-output unit.

Instructions T₁ and T₄ cannot be performed concurrently with
other machine operations.

Standard  subroutine  instructions 

Certain  instructions  in  the  Strela,  although  written  as  ordinary 
instructions,  are  actually  "synthetic"  instructions  which  call  on 
a  subroutine  for  computation  of  the  function  involved.  The  amount 
of  machine  time  (number  of  basic  instruction  cycles)  for  an  itera- 
tive process  depends  on  the  required  precision  of  the  computed 
function.  The  figures  given  below  are  based  on  approximately 
ten-digit  decimal  numbers  with  desired  precision  one  in  the  tenth 
place. 

25. D α β γ. This standard subroutine serves for execution
of the operation of division: The number (α) is divided into
the number (β) and the quotient is sent to cell γ.

The actual operation of division is executed in two steps: the
initial obtaining of the value of the inverse of the divisor, by which
the dividend is then multiplied. The computation of the inverse
is given by the usual Newton formula, originally used with the
EDSAC [Wilkes et al., 1952]:

$$y_{n+1} = y_n(2 - y_n x)$$

For $x = d \cdot 2^p$, where $1/2 \le d < 1$, the first approximation is taken
as $2^{-p}$. The standard subroutine takes 8 to 10 instructions and can
be executed in 18-20 machine cycles (execution time for one
typical command).

26. √ α 0000 γ. This instruction guarantees obtaining
the value √x from the value x = (α) and sending the result to
cell γ. Initially 1/√x is computed by the iteration formula

$$y_{n+1} = \tfrac{1}{2}\, y_n (3 - x y_n^2)$$

where the first approximation is taken as

$$y_0 = 2^{-[p/2]}$$

the bracket indicating "integral part of." After this the result is
multiplied by x to obtain √x. This standard subroutine contains 14
instructions and is executed in 40 cycles.

27. e^x α 0000 γ. This instruction guarantees formation
of e^x for the value x = (α) and sending the result to cell γ. The
computation is produced by means of expansion of e^x in a power
series. The standard subroutine contains 20 instructions and is
executed in 40 cycles.

30. ln x α 0000 γ. This instruction guarantees formation
of the function ln x for the value x = (α) and sending the result
to location γ. Computation is produced by expansion of ln x in
series. The subprogram contains 15 instructions and is executed
in 60 cycles.

31. sin x α 0000 γ. This instruction guarantees execution
of the function sin x and sending the result to location γ. The
computation is produced in two steps: initially the value of the
argument is translated into the first quadrant, then the value of
the function is obtained by a series expansion. The subroutine
contains 18 instructions and is executed in 25 cycles.


32. DB α n γ. This instruction performs conversion of
a group of n numbers, stored in locations α, α + 1, ..., from binary-coded
decimal into binary and sending of the result to locations
γ, γ + 1, .... The subroutine contains 14 instructions and
is executed in 50 cycles (for each number).

33. BD α n γ. This instruction performs the conversion
of a group of n numbers stored in locations α, α + 1, ... from the
binary system into binary-coded decimal and sends them to locations
γ, γ + 1, .... The subroutine contains only 30 instructions
and is executed with 100 cycles (for each number).

34. MS α n γ. This is an instruction for storage summing.
This instruction produces the formal addition of numbers,
stored in locations beginning with address α, and the result is sent
to location γ. Numbers and instructions are added in fixed point.
This sum may be compared with a previous sum for control of
storage accuracy.
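
The iterative schemes behind the division and square-root subroutines (instructions 25 and 26) can be tried out in ordinary floating point. This is a sketch under stated assumptions, not the Strela subroutines themselves: the reciprocal square root step is assumed to be the standard Newton iteration, the starting exponent is taken with floor division, only positive arguments are handled, and the iteration counts are arbitrary choices.

```python
import math

def newton_reciprocal(x: float, iterations: int = 6) -> float:
    """Approximate 1/x for x > 0 with the step y <- y * (2 - y * x)."""
    if x <= 0:
        raise ValueError("this sketch handles positive x only")
    _, p = math.frexp(x)            # x = d * 2**p with 1/2 <= d < 1
    y = math.ldexp(1.0, -p)         # first approximation 2**(-p)
    for _ in range(iterations):
        y = y * (2.0 - y * x)
    return y

def newton_sqrt(x: float, iterations: int = 10) -> float:
    """Approximate sqrt(x) for x > 0 via 1/sqrt(x), then multiply by x."""
    if x <= 0:
        raise ValueError("this sketch handles positive x only")
    _, p = math.frexp(x)
    y = math.ldexp(1.0, -(p // 2))  # first approximation 2**(-[p/2])
    for _ in range(iterations):
        y = 0.5 * y * (3.0 - x * y * y)  # assumed standard Newton step
    return x * y
```

A quotient is then formed as the dividend times newton_reciprocal(divisor). Because neither iteration contains a division, both suit a machine whose basic arithmetic offers only addition, subtraction, and multiplication.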

References 

BasiI57; KitoA56; LebeS56; WilkM52.


Section  2 

Processors  constrained  by  a  cyclic, 
primary  memory 

These processors use one extra (the + 1) address to specify
the address of the next instruction. This address
allows complete freedom in locating both operands
and next instructions in an optimum manner. The IBM 650,
a 1 + 1 address computer, is the most straightforward to understand.
ACE and ZEBRA have subtle microcoded instructions
to achieve powerful instruction sets. The LGP-30 and LGP-21
have a simple 1 address instruction format; they interlace several
logical addresses between the physical addresses to help
with the optimum location of operands.
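
The value of the + 1 address on a cyclic memory can be seen from a small latency calculation. The sketch below is illustrative only: the function name and the 64-word track in the usage note are hypothetical, and timing is counted in whole word-times.

```python
def word_times_to_wait(current: int, target: int, words_per_track: int) -> int:
    """Word-times that pass before the target location reaches the head,
    given the location now under the head on a cyclic track."""
    if words_per_track <= 0:
        raise ValueError("track must hold at least one word")
    return (target - current) % words_per_track
```

On a hypothetical 64-word track with the head at location 10, a next instruction placed at location 11 is available after one word-time, whereas one placed at location 9 costs 63 word-times, nearly a full revolution. Being free to choose the next-instruction address lets the programmer place it where it will arrive just as the current operation finishes.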

The  Olivetti  Underwood  Programma  101  desk  calculator 

The  Programma  101  is  a  desk  calculator  computer  implemented 
with  a  cyclic  Mp.  The  cyclic  memory  is  not  apparent  from  the 
user's  viewpoint  because  the  response  is  adequate  (less  than 
0.1  sec  for  simple  arithmetic  operations).  The  Programma  101 
is  discussed  in  Part  3,  Sec.  4,  page  235. 

ZEBRA,  a  simple  binary  computer 

The  ZEBRA  is  presented  in  Chap.  12  and  is  discussed  in  Part 
3,  Sec.  1,  page  190. 

The  LGP-30  and  LGP-21 

The LGP-30 (Chap. 16) is a first-generation, 31-bit computer
with an Mp.cyclic and a very simple ISP. The computer appears
to be characteristic of small-scale drum computers in the first
generation. We think of this class of computer as having very
little power when compared, for example, with the IBM 701.
However, the power is mostly related to the drum-based technology,
with 0.26 ~ 16.66 millisecond access times.


The  Pilot  ACE 

The  NPL  Pilot  ACE  is  presented  in  Chap.  11.  Its  relationship  in 
the  computer  space  is  discussed  in  Part  3,  Sec.  1,  page  190. 

The  UNIVAC  system 

The  UNIVAC  I  is  described  in  Chap.  8.  A  discussion  is  given 
in  Part  2,  Sec.  1,  page  91. 

The  design  philosophy  of  Pegasus, 
a  quantity-production  computer 

The  Pegasus  cyclic  memory,  general  register  computer  (Chap. 
9)  is  discussed  in  Part  2,  Sec.  2,  page  170. 

IBM  650  instruction  logic 

The IBM 650 has a 1 + 1 address format and a very complete
instruction  set.  Because  of  the  long  word  length  (10  decimal 
digits)  we  would  consider  it  to  have  general  utility.  The  650's 
high  performance  is  achieved  by  using  a  fast  drum  (6  millisec- 
onds/revolution). The  characteristics  given  in  Chap.  17  present 
the  machine  as  it  was  first  introduced  in  1954.  Later  versions 
provided  options  for  floating  point  arithmetic  and  index  regis- 
ters. A  96-word  core  buffer  was  also  added  for  disk  and  mag- 
netic-tape buffering.  The  machine  structure  is  a  simple  1  Pc 
without  concurrent  processing  and  input/output  transfer  abil- 
ity. Although  the  650  has  a  large  word,  it  initially  processed  only 
fixed  point  integers. 

NOVA:  a  list-oriented  computer 

The  NOVA  (Chap.  26)  is  a  specialized  computer  for  processing 
array  data.  It  is  discussed  in  Part  4,  Sec.  2,  page  315. 




Chapter  16 

The  LGP-30  and  LGP-21 


C('LGP-30; technology: (113 vacuum tubes), (1350 diodes);
  power: 1500 watts; weight: 800 pounds; number produced:
  320 ~ U^O; t.delivery: September 1956; descendant: 'LGP-21;
  Pc(1 address; 1 instruction/w; data: w,bv,i,fr; Mps (~ 2 w);
    operations: (+, -, ×, /, ∧, × 2))
  Mp(drum; t.cycle: 260 μs/w; t.access: (.260 ~ 16.6) ms;
    i.rate: 2.34 ms/w contiguous addresses; 4096 w;
    (31, 1 space) b/w)
  T(Flexowriter, paper tape))

C('LGP-21; technology: (1160 transistors), (375 diodes); power:
  300 watts; weight: 90 pounds; number produced: ~ 150;
  t.delivery: December 1962;
  Mp(fixed head disk; cyclic; t.cycle: 400 μs/w; t.access:
    (0 ~ 52) ms; i.rate: 7.26 ms/w contiguous addresses;
    4096 w; (31, 1 space) b/w)
  T(#1:32; Flexowriter, paper tape, analog, CRT, card))


The LGP-30 is a small computer with an Mp.drum. It is distinct
from the first (and succeeding) generation computers using
Mp.random_access and can be described by using the PMS diagram
in Fig. 1. The LGP-21, a direct descendant of the LGP-30,
having the same ISP, is also described by Fig. 1.

Since there is only one address/instruction, a method is needed
for the optimal allocation of operands. Otherwise, each instruction
might have to wait a complete drum (or disk) revolution each time
a data reference is made. The LGP-30 provides for operand-location
optimization by interlacing the logical addresses on the
drum so that two adjacent addresses (e.g., 00 and 01) are separated
by nine physical locations.¹ These spaces allow for operands to
be located next to the instructions which use them. There are 64
tracks, each with 64 words (sectors). Each word is accessed by
a track address of 6 bits and a word address of 6 bits. The sequence
of words (sectors) within a track is 00, 57, 50, 43, 36, 29, 22, 15,
08, 01, 58, 51, 44, 37, . . . , 06, 63, 56, 49, 42, 35, 28, 21, 14,
07, 00. The time between two adjacent physical words is approximately
0.260 millisecond, and the time between two adjacent
addresses is 9 × 0.260 or 2.340 milliseconds. The actual maximum
t.access is 16.66 ms.²
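
The interlace described above can be reproduced with a few lines of modular arithmetic. The following is only a sketch in Python: the function and constant names are illustrative, and the one modelling assumption is that logical sector a sits in physical slot 9a mod 64, which is what the quoted sequence 00, 57, 50, ... implies.

```python
SECTORS_PER_TRACK = 64   # words (sectors) per track, from the text
INTERLACE = 9            # physical slots from one logical address to the next
WORD_TIME_MS = 0.260     # approximate time between adjacent physical words


def physical_slot(logical_sector):
    """Physical slot on the track that holds the given logical sector."""
    return (logical_sector * INTERLACE) % SECTORS_PER_TRACK


def track_layout():
    """Logical sector numbers in the order they pass under the head."""
    layout = [0] * SECTORS_PER_TRACK
    for logical in range(SECTORS_PER_TRACK):
        layout[physical_slot(logical)] = logical
    return layout


def wait_ms(from_sector, to_sector):
    """Rotational delay in going from one logical sector to another."""
    slots = (physical_slot(to_sector) - physical_slot(from_sector)) % SECTORS_PER_TRACK
    return slots * WORD_TIME_MS


print(track_layout()[:10])        # starts 0, 57, 50, 43, ...
print(round(wait_ms(0, 1), 3))    # delay between adjacent addresses
```

Because 9 and 64 have no common factor, every logical sector receives a distinct slot. A full revolution in this model is 64 × 0.260 = 16.64 ms, close to the 16.66 ms maximum access time quoted above.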

Half of the instruction (15 bits) is unused. It could be used for
extra instructions, indexing, indirect addressing, or a second (+1)
address to locate the next instruction, all of which increase the
performance.

¹The LGP-21 has a space of 18 words.

²The later LGP-21 appears to have a lower performance than the LGP-30
by about a factor of 3.


Fig.  1.  LGP-30  and  LGP-21  PMS  diagrams. 


The ISP, given in Appendix 1 of this chapter, is about the most
straightforward in the book. There are only 16 instructions, and
the program state is less than two words. Although the performance
is limited because of an Mp.cyclic_access, an Mp.random_access
would serve to make the ISP fairly similar to other
faster computers, e.g., an IBM 701.


Part 3 | The instruction-set processor level: variations in the processor

Section 2 | Processors constrained by a cyclic, primary memory


APPENDIX 1  LGP-30 AND LGP-21 ISP DESCRIPTION

Pc State
  A<0:30>                                  Accumulator
  C<18:23,24:29>                           Program Counter register
  Ov                                       Overflow, LGP-21 only; on LGP-30 machine stops if an overflow
  Run

Pc Console State
  BP<4,8,16,32>                            Break Point switches
  TC                                       Transfer Control switch

Mp State
  M[0:77₈][0:77₈]<0:30>                    primary memory; 2¹² w; track and sector (word)

K State
  The following Input Output devices do not have synchronization
  description variables. LGP-21 only. LGP-30 has a Flexowriter.
  Input_device[0:31]<1:6>
  stop code                                condition signifying input device has read a special code
  Output_device[0:31]<1:6>

Instruction Format
  i<0:30>                                  instruction
  op<0:3> := i<12:15>                      operation code
  t<0:5>  := i<18:23>                      track select bits on Mp
  t'<0:4> := t<1:5>                        input-output select, LGP-21 only
  s<0:5>  := i<24:29>                      sector select bits of Mp
  skip condition := ((t<0:3> ∧ ¬BP) = 0)

Instruction Interpretation Process
  Run → (i ← M[C]; C ← C + 1; next         fetch
    Instruction_execution)                 execute

Instruction Set and Instruction Execution Process
  Instruction_execution := (
    Z (:= op = 0) → (
      (t = 000000) → (Run ← 0);
      skip condition → (C ← C + 1);        sense BP and transfer
      i<0> → (Ov → (Ov ← 0; C ← C + 1)));  sense overflow and transfer
    B (:= op = 1) → (A ← M[t][s]);         bring from memory
    Y (:= op = 2) → (M[t][s]<18:29> ← A<18:29>);   store address
    R (:= op = 3) → (M[t][s]<18:29> ← C + 1);      set return address
    I (:= op = 4) → (                      shifts, and input
      ¬i<0> ∧ (t = 62) → (A ← A × 2⁶ {logical});
      i<0> ∧ (t = 62) → (A ← A × 2⁴ {logical});
      ¬i<0> ∧ (t ≠ 62) → (input_6_bit);
      i<0> ∧ (t ≠ 62) → (input_4_bit));
    D (:= op = 5) → (Ov,A ← round(A / M[t][s]));           divide
    N (:= op = 6) → (A ← A × M[t][s] {s.integer});         multiply, save right
    M (:= op = 7) → (A ← A × M[t][s] {s.fraction});        multiply, save left
    P (:= op = 10₈) → (
      ¬i<0> → (Output_device[t']<1:6> ← A<0:5>);           print 6 bit
      i<0> → (Output_device[t']<1:6> ← A<0:3>□10));        print 4 bit
    E (:= op = 11₈) → (A ← A ∧ M[t][s]);                   extract
    U (:= op = 12₈) → (C ← t□s);                           unconditional transfer
    T (:= op = 13₈) → (
      i<0> → ((A<0> ∨ TC) → (C ← t□s));                    transfer control
      ¬i<0> → (A<0> → (C ← t□s)));                         conditional transfer
    H (:= op = 14₈) → (M[t][s] ← A);                       hold and store
    C (:= op = 15₈) → (M[t][s] ← A; next A ← 0);           clear
    A (:= op = 16₈) → (Ov□A ← A + M[t][s]);                add
    S (:= op = 17₈) → (Ov□A ← A − M[t][s])                 subtract
  )                                                        end Instruction_execution

Input processes
  input_6_bit := (A ← A × 2⁶ {logical}; next
    A<25:30> ← Input_device[t']; next
    ¬(A<0> ∨ stop code) → input_6_bit)
  input_4_bit := (A ← A × 2⁴ {logical}; next
    A<27:30> ← Input_device[t']<1:4>; next
    ¬(A<0> ∨ stop code) → input_4_bit)
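
The instruction format in the appendix (op := i<12:15>, t := i<18:23>, s := i<24:29>, with bit 0 the leftmost of 31 bits) can be sketched as a field extractor. This is an illustrative Python sketch, not part of the original ISP: the helper names are hypothetical, and holding a word as a plain integer whose bit 30 is the least significant bit is an assumption made for the example.

```python
WORD_BITS = 31  # i<0:30>; bit 0 is taken as the leftmost (most significant) bit

# Mnemonics in operation-code order 0..17 (octal), as listed in the appendix.
MNEMONICS = "ZBYRIDNMPEUTHCAS"


def field(word, first, last):
    """Extract bits first..last (inclusive) using the ISP's left-to-right numbering."""
    width = last - first + 1
    shift = (WORD_BITS - 1) - last
    return (word >> shift) & ((1 << width) - 1)


def decode(word):
    """Split a 31-bit instruction word into the fields named in the ISP."""
    return {
        "sign": field(word, 0, 0),              # i<0>, used as a modifier bit
        "op": MNEMONICS[field(word, 12, 15)],   # op<0:3>
        "track": field(word, 18, 23),           # t<0:5>
        "sector": field(word, 24, 29),          # s<0:5>
    }


def encode(op, track, sector, sign=0):
    """Inverse of decode; bits not named in the format are left at zero."""
    return (sign << 30) | (MNEMONICS.index(op) << 15) | (track << 7) | (sector << 1)


print(decode(encode("B", 0o12, 0o34)))
```

Only 4 + 6 + 6 bits plus bit 0 carry information here, which is the sense in which roughly half of the instruction word is unused.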

Chapter 17

IBM 650 instruction logic¹


John W. Carr III

The basic IBM 650 is a magnetic drum (10, 0, 0)² decimal computer
with one-plus-one address instruction logic. It has a storage of 1000
or 2000 10-digit words (plus sign) with addresses 0000-0999 or
0000-1999. More extended versions of the equipment have built-in
floating point arithmetic and index accumulators, but the basic
machine will be described here. There are three arithmetic registers
in addition to the standard program register and program
counter. All information from the drum to the arithmetic unit
passes through a signed 10-digit distributor. A twenty-digit accumulator
is divided into a lower and upper part, each of 10 digits
with sign. Each of these is addressable (distributor 8001, lower
accumulator 8002, and upper accumulator 8003). Each accumulator
may be cleared to zero separately (in IBM 650 terminology,
"reset"). The entire 20-digit register can be considered as a unit,
or each part separately (but affecting the other in case of carries).
The 10-digit instruction is broken down into the following form:


  10  9   |  8  7  6  5   |  4  3  2  1   |  0
  Op.     |  Data         |  Next         |  Sign
  Code    |  Address      |  Instruction  |
          |               |  Address      |
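
As an illustration of the layout above, the sketch below splits a signed 10-digit word into its operation code, data address, and next-instruction address. It is a minimal Python sketch; holding the word as a signed integer and the name `decode_650` are assumptions made for the example, not anything defined by the machine.

```python
def decode_650(word):
    """Split a signed 10-digit IBM 650 word into its instruction fields.

    Positions 10-9 hold the operation code, 8-5 the data (D) address,
    4-1 the next-instruction (I) address, and position 0 the sign.
    """
    if abs(word) >= 10 ** 10:
        raise ValueError("an instruction word has at most 10 digits")
    digits = f"{abs(word):010d}"           # position 10 is the leftmost digit
    return {
        "op": digits[0:2],
        "data_address": digits[2:6],
        "instruction_address": digits[6:10],
        "sign": "-" if word < 0 else "+",
    }


# 65 is RAL (Reset and Add into Lower) in the list that follows.
print(decode_650(6519500101))
```

An integer cannot carry a minus zero, so a fuller model would store the sign separately; the sketch ignores that case.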

One particular instruction, Table Look-Up, allows automatic table
search for one particular element in a table, which can be stored
with a corresponding functional value. Input-output is via 80-digit
numerical punched cards. An "alphabetic device" allows limited
alphabetical entry on cards. Only certain 10-word groups on the
magnetic drum are available for input and output. The following
information is taken from an IBM 650 manual [Type 650, Magnetic
Drum Data-Processing Machine Manual of Operations]. Much of
the input-output is handled via board wiring, which is not described
in detail below. The two-digit pair preceding each mnemonic
represents the machine code. The BRD (Branch on Digit) operation
is used with special board wiring to tell when certain specific
card punches exist.

¹In E. M. Grabbe, S. Ramo, and D. E. Wooldridge (eds.), "Handbook of
Automation, Computation, and Control," vol. 2, chap. 2, pp. 93-98,
John Wiley & Sons, Inc., New York, 1959.

²Carr's triplet notation for: fractional significant digits, digits in exponent,
and digits to left of radix point.


Input-output  instructions 

70 RD (Read). This operation code causes the machine to
read cards by a two-step process. First, the contents of the 10
words of read buffer storage are automatically transferred to one
of the 20 (or 40) possible 10-word groups of read general storage.
The group selected is determined by the D address of the Read
instruction. Secondly, a card is moved under the reading brushes,
and the information read is entered into buffer storage for the next
Read instruction.

71 PCH (Punch). This operation code causes card punching
in two steps. First, the contents of one of the 20 (or 40) possible
10-word groups of punch storage are transferred to punch buffer
storage. The group selected is specified by the D address of the
Punch instruction. Secondly, the card is punched with the information
from buffer storage.

69 LD (Load Distributor). This operation code causes the
contents of the D address location of the instruction to be placed
in the distributor.

24 STD (Store Distributor). This operation code causes the
contents of the distributor with the distributor sign to be stored
in the location specified by the D address of the instruction. The
contents of the distributor remain undisturbed.

Addition  and  subtraction  instructions 

10 AU (Add to Upper). This operation code causes the
contents of the D address location to be added to the contents
of the upper half of the accumulator. The lower half of the accumulator
will remain unaffected unless the addition causes the
sign of the accumulator to change, in which case the contents of
the lower half of the accumulator will be complemented. Also,
the units position of the upper half of the accumulator will be
reduced by one.

15 AL (Add to Lower). This operation code causes the
contents of the D address location to be added to the contents
of the lower half of the accumulator. The contents of the upper
half of the accumulator could be affected by carries.

11 SU (Subtract from Upper). This operation code causes
the contents of the D address location to be subtracted from the
contents of the upper half of the accumulator. The contents of
the lower half of the accumulator will remain unaffected unless
the subtraction causes a change of sign in the accumulator, in
which case the contents of the lower half of the accumulator will
be complemented. Also, the units position of the upper half of
the accumulator will be reduced by one.

16  SL  (Subtract  from  Lower).  This  operation  code  causes 
the  contents  of  the  D  address  location  to  be  subtracted  from  the 
contents  of  the  lower  half  of  the  accumulator.  The  contents  of 
the  upper  half  of  the  accumulator  could  be  affected  by  carries. 

60  RAU  (Reset  and  Add  into  Upper).  This  operation  code 
resets  the  entire  accumulator  to  plus  zero  and  adds  the  contents 
of  the  D  address  location  into  the  upper  half  of  the  accumulator. 

65  RAL  (Reset  and  Add  into  Lower).  This  operation  code 
resets  the  entire  accumulator  to  plus  zero  and  adds  the  contents 
of  the  D  address  location  into  the  lower  half  of  the  accumulator. 

61  RSU  (Reset  and  Subtract  into  Upper).  This  operation 
code  resets  the  entire  accumulator  to  plus  zero  and  subtracts  the 
contents  of  the  D  address  location  into  the  upper  half  of  the 
accumulator. 

66  RSL  (Reset  and  Subtract  into  Lower).  This  operation 
code  resets  the  entire  accumulator  to  plus  zero  and  subtracts  the 
contents  of  the  D  address  location  into  the  lower  half  of  the 
accumulator. 

Accumulator  store  instructions 

20 STL (Store Lower in Memory). This operation code
causes the contents of the lower half of the accumulator with the
accumulator sign to be stored in the location specified by the D address
of the instruction. The contents of the lower half of the
accumulator remain undisturbed.

It is important to remember that the D address for all store
instructions must be 0000-1999. An 8000 series D address will not
be accepted as valid by the machine on any of the store instructions.

21 STU (Store Upper in Memory). This operation code
causes the contents of the upper half of the accumulator with the
accumulator sign to be stored in the location specified by the
D address of the instruction. If STU is performed after a division
operation, and before another division, multiplication, or reset
operation takes place, the contents of the upper accumulator will
be stored with the sign of the remainder from the divide operation
(Op-Code 14). The contents of the upper half of the accumulator
remain undisturbed.

22 STDA (Store Lower Data Address). This operation code
causes positions 8-5 of the distributor to be replaced by the contents
of the corresponding positions of the lower half of the accumulator.
The modified word in the distributor with the sign of
the distributor is then stored in the location specified by the
D address of the instruction.

23 STIA (Store Lower Instruction Address). This operation
code causes positions 4-1 of the distributor to be replaced by the
contents of the corresponding positions of the lower half of the
accumulator. The modified word in the distributor with the sign
of the distributor is then stored in the location specified by the
D address of the instruction. The contents of the lower half of
the accumulator remain unchanged, and the sign of the accumulator
is not transferred to the distributor. The modified word remains
in the distributor upon completion of the operation.

Absolute  value  instructions 

17 AABL (Add Absolute to Lower). This operation code
causes the contents of the D address location to be added to the
contents of the lower half of the accumulator as a positive factor
regardless of the actual sign. When the operation is completed,
the distributor will contain the D address factor with its actual
sign.

67 RAABL (Reset and Add Absolute into Lower). This
operation code resets the entire accumulator to zeros and adds
the contents of the D address location into the lower half of the
accumulator as a positive factor regardless of its actual sign. When
the operation is completed, the distributor will contain the D address
factor with its actual sign.

18 SABL (Subtract Absolute from Lower). This operation
code causes the contents of the D address location to be subtracted
from the contents of the lower half of the accumulator as a positive
factor regardless of the actual sign. When the operation is completed,
the distributor will contain the D address factor with its
actual sign.

68 RSABL (Reset and Subtract Absolute into Lower). This
operation code resets the entire accumulator to plus zero and
subtracts the contents of the D address location into the lower
half of the accumulator as a positive factor, regardless of the actual
sign. When the operation is completed, the distributor will contain
the D address factor with its actual sign.

Multiplication  and  division 

19 MULT (Multiply). This operation code causes the machine
to multiply. A 10-digit multiplicand may be multiplied by
a 10-digit multiplier to develop a 20-digit product. The multiplier
must be placed in the upper accumulator prior to multiplication.
The location of the multiplicand is specified by the D address of
the instruction. The product is developed in the accumulator
beginning in the low-order position of the lower half of the accumulator
and extending to the left into the upper half of the
accumulator as required.

14 DIV (Divide). This operation code causes the machine
to divide without resetting the remainder. A 20-digit dividend may
be divided by a 10-digit divisor to produce a 10-digit quotient.
In order to remain within these limits, the absolute value of the
divisor must be greater than the absolute value of that portion of
the dividend that is in the upper half of the accumulator. The
entire dividend is placed in the 20-position accumulator. The
location of the divisor is specified by the D address of the divide
instruction.

64 DIV RU (Divide and Reset Upper). This operation
code causes the machine to divide as explained under operation
code 14 (DIV). However, the upper half of the accumulator containing
the remainder with its sign is reset to zeros.

Branching  instructions  (decision  operations) 

44 BRNZU (Branch on Non-Zero in Upper). This operation
code causes the contents of the upper half of the accumulator
to be examined for zero. If the contents of the upper half of the
accumulator is nonzero, the location of the next instruction to be
executed is specified by the D address. If the contents of the upper
half of the accumulator is zero, the location of the next instruction
to be executed is specified by the I address. The sign of the accumulator
is ignored.

45 BRNZ (Branch on Non-Zero). This operation code
causes the contents of the entire accumulator to be examined for
zero. If the contents of the accumulator is nonzero, the location
of the next instruction to be executed is specified by the D address.
If the contents of the accumulator is zero, the location of the next
instruction to be executed is specified by the I address. The sign
of the accumulator is ignored.

46 BRMIN (Branch on Minus). This operation code causes
the sign of the accumulator to be examined for minus. If the sign
of the accumulator is minus, the location of the next instruction
to be executed is specified by the D address. If the sign of the
accumulator is positive, the location of the next instruction to be
executed is specified by the I address. The contents of the accumulator
are ignored.

47 BROV (Branch on Overflow). This operation code
causes the overflow circuit to be examined to see whether it has
been set. If the overflow circuit is set, the location of the next
instruction to be executed is specified by the D address. If the
overflow circuit is not set, the location of the next instruction to
be executed is specified by the I address.

90-99 BRD 1-10 (Branch on 8 in Distributor Position
1-10). This operation code examines a particular digit position
in the distributor for the presence of an 8 or 9. Codes 91-99 test
positions 1-9, respectively, of the test word; code 90 tests position
10. If an 8 is present, the location of the next instruction to be
executed is specified by the D address. If a 9 is present, the location
of the next instruction to be executed is specified by the I address.
The presence of other than an 8 or 9 will stop the machine.
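
The BRD rule can be written out as a small decision function. This is a hedged Python sketch: the function name is hypothetical, the distributor is held as an integer, and position 1 is assumed to be the low-order digit, following the numbering of the instruction-format diagram.

```python
def brd_next_address(op_code, distributor, d_address, i_address):
    """Return the next-instruction address for BRD, or None if the machine stops."""
    if not 90 <= op_code <= 99:
        raise ValueError("BRD uses operation codes 90-99")
    # Codes 91-99 test positions 1-9; code 90 tests position 10.
    position = 10 if op_code == 90 else op_code - 90
    # Assumption: position 1 is the low-order digit of the distributor.
    digit = (abs(distributor) // 10 ** (position - 1)) % 10
    if digit == 8:
        return d_address      # branch
    if digit == 9:
        return i_address      # continue in sequence
    return None               # any other digit stops the machine


print(brd_next_address(90, 8000000009, d_address=100, i_address=200))
```

Returning `None` for the stop case is a convenience of the sketch; the text says only that the machine stops.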

Shift  instructions 

30 SRT (Shift Right). This operation code causes the contents
of the entire accumulator to be shifted right the number of
places specified by the units digit of the D address of the shift
instruction. A maximum shift of nine positions is possible. A data
address with units digit of zero will result in no shift. All numbers
shifted off the right end of the accumulator are lost.

31 SRD (Shift Round). This operation causes the contents
of the entire accumulator to be shifted right the number of places
specified by the units digit of the D address of the instruction.
A 5 is added (-5 if the accumulator is negative) in the twenty-first
(blind) position of the amount in the accumulator. A data address
units digit of zero will shift 10 places right with rounding.
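
The difference between SRT and SRD can be sketched numerically. The Python below is an illustrative model only: the accumulator is held as a signed integer of up to 20 digits, and adding 5 in the blind position is modelled as adding 5 × 10^(n-1) to the magnitude before dropping n digits, which is an interpretation of the description above rather than a statement of the circuit.

```python
def shift_right(acc, d_address):
    """SRT: shift right by the units digit of the D address; digits shifted off are lost."""
    n = d_address % 10                    # 0 means no shift
    magnitude = abs(acc) // 10 ** n
    return magnitude if acc >= 0 else -magnitude


def shift_round(acc, d_address):
    """SRD: shift right with rounding; a units digit of zero shifts 10 places."""
    n = d_address % 10 or 10
    # Adding 5 just below the new units position rounds the magnitude half up.
    magnitude = (abs(acc) + 5 * 10 ** (n - 1)) // 10 ** n
    return magnitude if acc >= 0 else -magnitude


print(shift_right(12355, 2), shift_round(12355, 2), shift_round(-12355, 2))
```

Working on the magnitude and reattaching the sign reproduces the "-5 if the accumulator is negative" rule.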

35 SLT (Shift Left). This operation code causes the contents
of the entire accumulator to be shifted left the number of
places specified by the units digit of the D address of the instruction.
A maximum shift of nine positions is possible. A data address
with a units digit of zero will result in no shift. All numbers shifted
off the left end of the accumulator are lost. However, the overflow
circuit will not be turned on.

36 SCT (Shift Left and Count). This operation code causes
(1) the contents of the entire accumulator to be shifted to the left
until a nonzero digit is in the most significant place, (2) a count
of the number of places shifted to be inserted in the two low-order
positions of the accumulator. This instruction is to aid fixed-point
scaling.

Table  look-up  instructions 

84 TLU (Table Look-up). This operation code performs an
automatic table look-up using the D address as the location of
the first table argument and the I address as the address of the
next instruction to be executed. The argument for which a search
is to be made must be in the distributor. The address of the table
argument equal to, or higher than (if no equal exists) the argument
given is placed in positions 8-5 of the lower accumulator. The
search argument remains, unaltered, in the distributor.
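
A table look-up of this kind amounts to finding the first table entry that is not less than the search argument. The sketch below is a simplified Python model under stated assumptions: the drum is a plain list indexed by address, the table arguments are non-negative and stored in ascending order from the D address, and both function names are hypothetical.

```python
def table_lookup(drum, d_address, argument):
    """Address of the first table argument >= the search argument, or None."""
    for address in range(d_address, len(drum)):
        if drum[address] >= argument:
            return address
    return None   # ran off the end of the modelled drum without a match


def insert_data_address(lower_accumulator, address):
    """Place a 4-digit address in positions 8-5 of a 10-digit lower accumulator."""
    high = lower_accumulator // 10 ** 8      # positions 10-9 are kept
    low = lower_accumulator % 10 ** 4        # positions 4-1 are kept
    return high * 10 ** 8 + address * 10 ** 4 + low


drum = [0] * 100 + [10, 20, 30, 40]          # a table starting at address 100
found = table_lookup(drum, 100, 25)
print(found, insert_data_address(9912345678, found))
```

Placing the result in positions 8-5 puts it where a data address sits in an instruction word, so the lower accumulator can be used directly to build an instruction that fetches the matching entry.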

Miscellaneous  Instructions 

00 No-Op (No Operation). This code performs no operation.
The data address is bypassed, and the machine automatically
refers to the location specified by the instruction address of the
No-Op instruction.

01 Stop. This operation code causes the program to stop
provided the programmed switch on the control console is in the
stop position. When the programmed switch is in the run position
the 01 code will be ignored and treated in the same manner as
00 (No-Op).

References 

Type 650 Magnetic Drum Data-Processing Machine Manual of Operations;
HughE54; SerrR62.


Section  3 

Processors  for  variable-length-string  data 


Although only two computers are described in this section, the
reader might refer to other computers in the book which handle
variable-length strings. The IBM System/360 processes a string
whose length is specified in the instruction. The Burroughs
B 5000 has a very nice string data ISP (both simple and powerful).

Variable-length strings imply some method to specify at instruction
execution time the actual length of the character
strings being processed. Which method is used has a substantial
effect on the ISP of the resulting machine, and it is noteworthy
that a wide variety of devices has been tried without any
apparent consensus yet on the appropriate mechanism:

1 An extra bit in each character to mark the string boundary
(IBM 1401)

2 A special terminal character to mark the string boundary
(IBM 702)

3 A field variable in the instruction to specify the string
length (IBM System/360)

4 A register variable in the processor to specify the string
length (an 8-bit-character computer, Chap. 10)

5 A fixed number of characters at the head of the string
to specify the length (and data type) of the string (used
extensively for variable-length records on tape and disk,
though we know of no ISP that uses it)
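
The mechanisms listed above differ mainly in where the length information lives. The following Python sketch contrasts them on a toy memory of characters; it is illustrative only, the function names and the "#" terminator are hypothetical, and none of it reproduces the actual character codes or scan direction of the machines named. Methods 3 and 4 differ only in whether the length comes from the instruction or from a processor register, so one function stands for both.

```python
def read_marked(memory, start):
    """Method 1: each cell is (char, mark); a set mark flags the boundary character."""
    chars = []
    for char, mark in memory[start:]:
        chars.append(char)
        if mark:
            break
    return "".join(chars)


def read_terminated(memory, start, terminator="#"):
    """Method 2: a special terminal character ends the string."""
    chars = []
    for char in memory[start:]:
        if char == terminator:
            break
        chars.append(char)
    return "".join(chars)


def read_counted(memory, start, length):
    """Methods 3 and 4: the length is supplied by the instruction or a register."""
    return "".join(memory[start:start + length])


def read_prefixed(memory, start, prefix_digits=2):
    """Method 5: a fixed-size header at the head of the string gives its length."""
    length = int("".join(memory[start:start + prefix_digits]))
    body = start + prefix_digits
    return "".join(memory[body:body + length])


print(read_terminated(list("ABC#D"), 0), read_prefixed(list("03ABCD"), 0))
```

The first two methods spend storage on every string (a bit per character, or a reserved character code), while the last three keep the data characters unrestricted but need the length to be known before the string is scanned.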

The  IBM  1401 

The  1401  was  IBM's  most  popular  computer,  measured  by 
quantity  produced,  prior  to  the  1130/1800  and  System/360. 
However,  the  authors  of  this  book  were  unable  to  find  any 
technical  papers  on  its  design  or  design  philosophy.  The  1401 
is  based  on  earlier  business-oriented  computers  (Fig.  1,  page 
225).  It  evolved  a  great  deal,  as  can  be  seen  from  the  number 
of  "features"  which  can  be  appended  to  improve  it.  Successors, 
the  1440  and  1460,  are  also  improvements.  It  is  assumed  that 
early  computers  mainly  influence  successor  computers  within 
the  same  organization. 

An  8-bit-character  computer 

An 8-bit-character computer (Chap. 10) has been suggested by
the authors. It is a very restricted computer for processing
string data and illustrates another approach to string definitions;
the string length is specified by a variable in the processor.


Chapter  18 


The  IBM  1401 


The second-generation transistor-technology IBM 1401 has been
included both because a large number¹ have been produced and
because it differs from common fixed word length binary and decimal
computers. IBM 1401s are used in business data-processing
applications requiring variable-length character strings or fields
and rather limited calculating ability. Two specific applications
are as a card processor in making a transition from plugboard
programmed calculators to full-scale automatic computations and
for converting data from one medium to another, for example, from
card to tape. The 1401 was little used by the scientific, engineering,
and scientific business data-processing communities, probably
because of the limited Mp size, the low overall processing speed,
and the lack of concurrent I/O operation in the smaller configurations.
However, it did achieve considerable use as a stand-alone
Cio in C('7090) installations, perhaps because of the speed and
quality of the T('1403; line; printer).

Although undoubtedly influenced by machines outside the IBM
organization, the IBM 1401 is derived primarily from the IBM 702
and 705, which are variable word length decimal machines. The
relationship of the various IBM decimal computers to one another
is shown in Fig. 1. (RCA's early computers² also use a combination
of fixed-length and variable-length 7-bit character strings and may
have influenced the 1401.)

The IBM 1401's ISP was the first to be adopted by another
company. Honeywell defined its H-200 ISP to be a superset of the
IBM 1401 ISP. The ISP of the H-200 is more complex and increases
performance by organizing Mp by both characters and words.

The IBM 1401, 1440, and 1460 are the only IBM computers
to be completely character-string oriented. That is, both instructions
and data are stored in variable-length character strings; these
strings are addressed by a pointer register to the string. The address
integer is fixed at three characters. The encoding process
for addresses is given in Appendix 1 of this chapter. The 3-character
address (3 × 6 bits) is assigned as 3 × 4 bits of BCD digits for
encoding addresses 0:999; 2 × 2 bits for selecting 16 × 1,000
addresses; and 2 bits for selecting one of the three index registers.
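
The division of the 18 address bits can be illustrated with a small decoder. This Python sketch rests on assumptions that are not stated in the paragraph above: it takes the two extra bits of the hundreds character to select 0 to 3 thousands, the two extra bits of the units character to select a further 0, 4, 8, or 12 thousands, and the two extra bits of the tens character to select the index register (0 meaning none). The function name and the argument layout are hypothetical, and the digits are passed as plain integers rather than as machine character codes.

```python
def decode_1401_address(digits, zones):
    """Decode a 3-character address into (memory address, index register).

    digits: (hundreds, tens, units), each 0..9   -> 3 x 4 bits of BCD
    zones:  (hundreds, tens, units), each 0..3   -> the remaining 3 x 2 bits
    """
    hundreds, tens, units = digits
    zone_h, zone_t, zone_u = zones
    if not all(0 <= d <= 9 for d in digits) or not all(0 <= z <= 3 for z in zones):
        raise ValueError("digits must be 0..9 and zone values 0..3")
    base = 100 * hundreds + 10 * tens + units      # addresses 0:999
    thousands = zone_h + 4 * zone_u                # assumed split of the 2 x 2 bits
    index_register = zone_t                        # assumed: 0 = no indexing, 1..3
    return base + 1000 * thousands, index_register


print(decode_1401_address((9, 9, 9), (3, 0, 3)))
```

Whatever the exact bit assignment, the arithmetic shows why three characters suffice: 1,000 decimal addresses times 16 selections gives 16,000 characters, with two bits left over for indexing.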

The IBM 1620 processes variable-length data strings, although
the instruction length is a fixed 12-digit string corresponding to
a word in Mp. The 1620, though not identical to the 1401, is
almost a member of the same family.

¹Up to 1966, more 1401s were produced than any other model. An estimated
7,500 1401s, 1,500 1401 G's (card-only system), 3,600 1440s, and 1,500
1460s were produced. About 1,800 1620s were produced.

²RCA 301, 501, and 601.

The 1401 evolved. Figure 1 shows the evolution of "features"
which have created new computers. The 1401's optional features
are mainly design afterthoughts; they sometimes increase performance,
sometimes make certain operations possible, and sometimes
provide substantive change. There are approximately 19 features
in the 1401: memory expansion beyond the anticipated 4,000
characters and index registers required encoding the field bits of
the A and B addresses; store A-Address and store B-Address register


[Chart not reproduced. Horizontal axis: year of first delivery.
Technology bands: first generation (vacuum tubes) and second generation.
Machines shown include the CPC, 607, 608, 609, 610; 702, 705, 705 III;
305 (RAMAC); 7070, 7072, 7074; 1620, 1710; 1401, 1440, 1460; 1410; 7010;
and the 1401-compatible Honeywell H-200.]

Fig.  1.  IBM  decimal  and  character-string  computer  relationships. 




226  Part  3     The  instruction-set  processor  level;  variations  in  the  processor 


Section  3  |  Processors  for  variable-length-string  data 


instructions are necessary for subroutines: the Store Address Register
Feature; Indexing Feature; Multiply-Divide Feature; High-Low-Equal
Compare Feature; Read Release and Punch Release
Feature; the Column Binary Feature; Early-Card-Read Feature;
Processing Overlap Feature, etc.


PMS  structure 

The 1401 PMS structure (Fig. 2) is an early 1 Pc structure. The
diagram does not show the S(fixed) Pc interconnection structure
with the Ms and T. The Pc-(Ms|T) interconnection restricts the
concurrency of T and Ms. The optional processing overlap feature
provides a link to Mp to allow the T(card; read, punch) to be run
concurrently with Pc processing. When any of the peripheral
devices are operating without the processing overlap feature, the
Pc is dedicated to be a data transmission link or K (as in earlier
computers). The device K is connected directly to Pc. For example,
Ms(disk, magnetic tape) data transfers use the main registers of
the Pc and can tie it up full time during data transmission. By
careful programming, several devices can be synchronized and
thus run concurrently for communicating with Pc from a K. The
Pc does not have an interrupt system. Thus the peripherals have
no way of initiating communication with Pc. Subsequent models,
the 1440 and 1460, added interrupt capability and made it easier to
control multiple simultaneous data transfers among the peripheral K's
and Pc.


Mp² - Pc¹ - T.console
            T('1402; card; reader, punch)
            T('1403 | '1404; line; printer)
            T('1407 Console Inquiry Station; typewriter)
            T(paper tape; reader)
            Ms(#1:6; magnetic tape)
            Ms('1405; disk)

¹ Pc(string; 1 ~ 8 char/instruction; M.processor state
  (7 ~ 16 char); technology: vacuum tubes; 1960 ~ 1965;
  descendants: '1440, '1460)

² Mp(core; 11.5 μs/char; 4000 ~ 16000 char; (7,1 parity)
  b/char)


Fig.  2.  IBM  1401  PMS  diagram. 


ISP  structure 

The IBM 1401 ISP is given in Appendix 1 of this chapter. Instruction
strings and data strings are delimited by the special F bit
in a character. A character in Mp is of the form¹

C<check,F,B',A',8,4,2,1>

An n-character string is C[0], C[1], . . . , C[n − 1]

and would be stored in Mp[j : j + n − 1].

The first character (or head) of an instruction must contain the
word-mark flag or F bit. The head of the instruction, which is to
be interpreted next, is held at Mp[I], and succeeding characters
of the instruction are at Mp[I + 1], Mp[I + 2], etc. Correctly
defined instructions are 1, 2, 4, 5, 7, and 8 characters long.
Undefined instruction lengths of up to 8 characters are also
interpreted without an error condition. The interpretation algorithm
presented in the ISP description does not explain the action of
instructions which have an incorrect length. Actually, the 1401
Reference Manual does not go into details of general instruction
interpretation but dwells on "correct" operation. Table 1 presents
the correct instruction lengths and formats. If we take the
instructions in the table, the set is not variable in length but is
fixed at these six sizes. The instruction set (not including the
input/output instructions) is presented in Table 2. This table also
provides a hint of the implementation, since the execution times are
given in terms of memory cycles.
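The word-mark rule for finding the extent of an instruction can be sketched in a few lines. This is only an illustration under stated assumptions: Mp is modeled as a list of (F bit, character) pairs, the end of the list is treated as a terminator, and the names `fetch_instruction` and `CORRECT_LENGTHS` are hypothetical. Raising an error after 8 characters stands in for the halt that Appendix 1 specifies for over-long instructions.

```python
# The six correctly defined instruction lengths named in the text.
CORRECT_LENGTHS = {1, 2, 4, 5, 7, 8}


def fetch_instruction(mem, i):
    """Collect the instruction whose head is at mem[i].

    mem is a list of (f_bit, char) pairs. The head must carry the F bit;
    the instruction ends just before the next character carrying an F bit.
    Returns (list of characters, address of the next instruction head).
    """
    f_bit, head = mem[i]
    if not f_bit:
        raise ValueError("instruction head must carry the F bit")
    chars = [head]
    j = i + 1
    while j < len(mem) and not mem[j][0]:
        if len(chars) == 8:
            raise ValueError("more than 8 characters without a word mark")
        chars.append(mem[j][1])
        j += 1
    return chars, j
```

A 7-character add followed by a 4-character branch would be fetched as two separate calls, the second starting at the address returned by the first. A caller can compare `len(chars)` against `CORRECT_LENGTHS` to separate the defined formats from the undefined lengths that the machine nevertheless interprets.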

The ISP state, unlike that of more conventional processors, has
no temporary operand storage (e.g., accumulators). The ISP state
has registers which point to operands. The state of the machine
(see Appendix 1) is basically: Mp, the Instruction Location Counter,
Indicators or miscellaneous bits, three 3-character blocks of Mp
reserved for Index registers, and the two registers A_address and
B_address which point to data operands.

Instruction  interpretation 

There are three principal state types in processing an instruction:
o.q, when the instruction is being formed; o.v, when the operands
are being accessed or the results are being stored in Mp; and o,
when the operation specified by the instruction is being carried
out. Each state transition corresponds essentially to a memory
access. The three instruction types of Fig. 3 each have their own
particular states. Only types 1 and 2 process the variable-length

¹ See Appendix 1 of this chapter for the meaning of the bits in a character.
We have renamed the A and B bits A' and B' to avoid confusion with
the registers.


Chapter  18  |  The  IBM  1401  227 


Table 1    IBM 1401 instruction formats

Length   Location:
(char)   M[I]   M[(I+1):(I+3)]   M[(I+4):(I+6)]   M[I+7]   Types

1        C[0]                                              no-op, halt, or single character to specify
                                                           a chained instruction
2        C[0]   C[1]                                       the d_character is used to specify additional
                                                           instruction information (e.g., select, card
                                                           stacker)
4        C[0]   C[1,2,3]                                   unconditional branch instruction or
                                                           single-address arithmetic; M[A] ← f(M[A])
5        C[0]   C[1,2,3]         C[4]                      conditional branch instruction; C[4] selects
                                                           a specific test
7        C[0]   C[1,2,3]         C[4,5,6]                  two-address instruction;
                                                           M[B] ← M[B] b M[A]; (e.g., add, subtract)
8        C[0]   C[1,2,3]         C[4,5,6]         C[7]     conditional branch based on Mp[B] character;
                                                           d_character is test character; (e.g., branch
                                                           if character equal)

Function of instruction characters:

C[0]                   op code; always contains a word-mark flag or F bit.
C[1,2,3]               branch address for I_Address register or first operand address
                       for the A_Address register.
C[1] or C[4] or C[7]   d_character; used as a single character for additional operation
                       code information, or a character for comparison, or to select a test.
C[4,5,6]               primary operand (B_Address register specification).


character strings, {char.string}, and the state diagram accounts for
strings on a character-at-a-time basis. For an add instruction
Fig. 3 oversimplifies the execution because it implies that each
character of the A and B operand is accessed, the addition is
performed, and the result is restored according to the B_address
register. A more complex description must account for A and B
strings of unequal length, and the case of getting a number which
must be recomplemented because it is the wrong sign. The
recomplementation process requires a reverse scan to find the end
of the B string and then a forward scan to recomplement each
character of B. Figure 4 is a detailed state diagram of the add
execution process.

The states in the ISP description (Appendix 1) within the
instruction-interpretation process correspond to the three state types
just described: the single-instruction character-fetch operation, the
fetch-operand-addresses for the remainder of the instruction, and
Instruction_execution. Instruction_execution is not given in any
detail. For example, the execution of add is defined as "A" (:=
op = 110001) → (Ov, M[B] ← M[B] + M[A] {char.string});. The
state diagram (Fig. 4) presents this execution in detail. Note that
in the ISP description we omit telling the reader that the A and B
address registers point to the next lowest variable-length string in
M after an operation is performed. We allow the definition of a
variable-string operation, for example, + {char.string}, to imply
the action on the processor state.

Some instructions can be defined with a single character, and
these are called chained instructions. Chained instructions take
the previous values of the pointer registers, the A and B address
registers, as the operand addresses. The add instruction, for example,
can be either 1 (chained), 4, or 7 characters; the forms of all
instructions appear in Table 1. The 4-character add instruction
places the A address field in both the A and B address registers;
thus the effect is an instruction to double a string (add it to itself).
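How the instruction length determines the operand pointers can be illustrated with a short sketch. It is a simplification under stated assumptions: addresses are plain integers, indexing is ignored, and the `mls` flag stands for the move, load, and store instruction types that Appendix 1 exempts from copying A into B. The function name `set_up_operands` and its parameters are hypothetical.

```python
def set_up_operands(length, a_field, b_field, prev_a, prev_b, mls=False):
    """Return the (A, B) address registers after an instruction is fetched.

    length  -- instruction length in characters
    a_field -- decoded A address field, or None when absent
    b_field -- decoded B address field, or None when absent
    prev_a, prev_b -- register values left by the previous instruction
    mls     -- True for move/load/store types, which leave B alone
    """
    if length in (1, 2):
        # Chained form: no address fields, so the old pointers are reused.
        return prev_a, prev_b
    if length in (4, 5):
        # Only an A field: it is also copied into B unless mls is set.
        return a_field, (prev_b if mls else a_field)
    if length in (7, 8):
        # Both fields are given explicitly.
        return a_field, b_field
    raise ValueError("not a correct instruction length")
```

With `prev_a = 500` and `prev_b = 600`, a 1-character add keeps (500, 600), while a 4-character add naming address 700 produces (700, 700), which is the doubling case described above.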

Data 

An n-decimal-digit numeric data string is represented as

C[n − 1], C[n − 2], . . . , C[1], C[0], C[M]

The underlined characters, C[n − 1] and C[M], have the flag bit
present, that is, (C[n − 1]<F> = 1) and (C[M]<F> = 1). The n
characters are stored in locations Mp[j], Mp[j + 1], . . . , Mp[j +




Table 2    IBM 1401 instruction set (excluding input, output)

                                         Op     Execution time              Length   Data
Instruction                              code†  in memory cycles‡           (char.)  type

Add (no recomplement)                    A      Li + 3 + La + Lb            1, 4, 7  char. string
Add (recomplement)                       A      Li + 3 + La + 4Lb           1, 4, 7  char. string
Branch                                   B      Li + 1                      4        3 char
Branch if Bit Equal§                     W      Li + 2                      8        1, 3 char
Branch if Character Equal                B      Li + 2                      8        1, 3 char
Branch if Indicator On                   B      Li + 1                      5        1, 3 char
Branch if Word Mark and/or Zone          V      Li + 2                      8        1, 3 char
Clear Storage                            /      Li + 1 + Lx                 1, 4, 7  char. string
Clear Word Mark                          ⌑      Li + 3                      1, 4, 7  1 char
Compare                                  C      Li + 1 + La + Lb            1, 7     char. string
Divide (aver.)§                          %      Li + 2 + 7LrLq + 8Lq        7        char. string
Halt                                     .      Li + 1                      1
Load Characters to A Word Mark           L      Li + 1 + 2La                4, 7     char. string
Modify Address§                          #      Li + 9                      4, 7     3 char
Move Characters to A or B Word Mark      M      Li + 1 + 2Lw                4, 7     char. string
Move Characters and Edit                 E      Li + 1 + La + Lb + Ly       7        char. string
Move Characters to Record or Word Mark§  P      Li + 1 + 2La                7        char. string
Move Characters and Suppress Zeros       Z      Li + 1 + 3La                7        char. string
Move and Insert Zeros§                   X      Li + 1 + 2La + 2Lz          7        char. string
Move Numeric                             D      Li + 3                      1, 7     1 char
Move Zone                                Y      Li + 3                      1, 7     1 char
Multiply (aver.)§                        @      Li + 3 + 2Lc + 5LcLm + 7Lm  7        char. string
No operation                             N      Li + 1                      1
Set Word Mark                            ,      Li + 3                      4, 7     1 char
Store A-Address Register§                Q      Li + 5                      4        3 char
Store B-Address Register§                H      Li + 4                      4        3 char
Subtract (no recomplement)               S      Li + 3 + La + Lb            1, 4, 7  char. string
Subtract (recomplement)                  S      Li + 3 + La + 4Lb           1, 4, 7  char. string
Zero and Add                             ?      Li + 1 + La + Lb            1, 4, 7  char. string
Zero and Subtract                        !      Li + 1 + La + Lb            1, 4, 7  char. string

† Alphanumeric code used to specify instruction.
‡ M(t.cycle: 11.5 μs/char)
§ Optional-feature instructions.

Abbreviations for symbols used in timing:

La   length of the A field (in characters)
Lb   length of the B field
Lc   length of multiplicand field
Li   length of instruction
Lm   length of multiplier field
Lq   length of quotient field
Lr   length of divisor field
Ls   number of significant digits in divisor (excludes high-order 0s and blanks)
Lw   length of A- or B-field, whichever is shorter
Lx   number of characters to be cleared
Ly   number of characters back to rightmost 0 in control field
Lz   number of 0s inserted in a field
Σ    number of fields included in an operation
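As a worked example of reading the timing column, the two add formulas can be evaluated directly. This is a small sketch that takes the cycle counts from the Add rows of Table 2 and the 11.5 μs cycle time from its footnote; the function names `add_cycles` and `add_time_us` are hypothetical.

```python
CYCLE_US = 11.5  # memory cycle time per character, from the table footnote


def add_cycles(li, la, lb, recomplement=False):
    """Memory cycles for Add: Li + 3 + La + Lb, or Li + 3 + La + 4Lb
    when the result must be recomplemented."""
    return li + 3 + la + (4 * lb if recomplement else lb)


def add_time_us(li, la, lb, recomplement=False):
    """Execution time of Add in microseconds."""
    return CYCLE_US * add_cycles(li, la, lb, recomplement)
```

For a 7-character add of two 5-digit fields, the count is 7 + 3 + 5 + 5 = 20 cycles, or 230 μs. If the result has to be recomplemented, the B field is traversed three more times: 7 + 3 + 5 + 20 = 35 cycles, or 402.5 μs.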



[State diagram not reproduced. Legible labels: start key; no termination
character for q; Type 1: M[B] ← f(M[A], M[B], {char string});
Type 2: M[B] ← f(M[A], {char string}).]

NOTE: The time in each state is roughly 1 memory cycle.

q      The instruction q
o.q    Operation and memory access to determine instruction q; a correct
       length instruction = 1, 2, 4, 5, 7, and 8 characters
o.v    Operation and memory access fetches to determine an operand
o      Operation specified in the instruction q; requires no time
o.v'   Operand and memory access stores to restore result operand


Fig.  3.  IBM  1401  instruction-interpretation  state  diagram. 

n − 1]. The values of the string are based on the bcd value of
the 8, 4, 2, 1 bits of each digit. The magnitude of the integer is

C[n − 1] × 10^(n−1) + C[n − 2] × 10^(n−2) + · · · + C[0] × 10^0

and the sign is

Sign := ((¬C[0]<A'> ∧ C[0]<B'>) → −;
         ¬(¬C[0]<A'> ∧ C[0]<B'>) → +)
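The magnitude and sign rules can be combined into a small decoder. This is a sketch under stated assumptions: each memory character is modeled as a tuple (F, B', A', digit) with the digit already extracted from the 8-4-2-1 bits, the string is entered at its tail C[0] (the highest address), and the scan moves toward lower addresses until it meets the word mark on C[n − 1]. The name `decode_number` is hypothetical.

```python
def decode_number(mem, tail):
    """Decode the signed integer whose least significant digit is mem[tail].

    mem is a list of (f_bit, b_zone, a_zone, digit) tuples.
    """
    _, b_zone, a_zone, _ = mem[tail]
    # Minus is encoded only by A' = 0 and B' = 1 in the tail character.
    negative = (a_zone == 0 and b_zone == 1)

    value = 0
    weight = 1
    addr = tail
    while addr >= 0:
        f_bit, _, _, d = mem[addr]
        value += d * weight
        if f_bit:
            # The word mark sits on the most significant digit.
            return -value if negative else value
        weight *= 10
        addr -= 1
    raise ValueError("no word mark found below the tail")
```

For instance, the three characters (1,0,0,1), (0,0,0,2), (0,1,0,3) stored at ascending addresses decode to −123 when entered at the last one, because the tail carries B' = 1 and A' = 0.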

A string is addressed (or accessed) via the A_address or B_address
pointer registers. These point to the tail (or least significant
digit), that is, C[0], of the string. The instruction-execution state
diagram of a variable-string add is shown in Fig. 4. The state
diagram assumes that A and B address registers are set up according
to Fig. 3. Thus Fig. 4 is a more detailed description of states
o.v, o.v, o, and o.v'. Each horizontal pair of states (Fig. 4)
corresponds to a single scan of the states of type 1 instruction o.v,
o.v, o, o.v' in Fig. 3. Transitions among states 2 and 3 correspond
to the character-by-character scan with string A and B being added
together; the result string is placed in B. States 4 and 5 define
the string addition when string A is terminated; i.e., it is
considered to be zero. States 7, 8, 9, and 10 define the
recomplementation process in which the B string has to be
recomplemented. This condition occurs when the operand signs differ
and the A field is greater than the B field; the results are in ten's
complement form. States 7 and 8 define the B-field scan (to return
to find the least digit of B), and states 9 and 10 define the
recomplementation of each character. Thus an add operation may
require up to three scans of the B string.
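The first scan of the add (the character-by-character addition, with a terminated A string treated as zero) can be sketched as below. This is deliberately a partial model: signs, zone bits, and the recomplementation scans are left out, memory is a list of mutable [F, digit] pairs, and the name `add_magnitudes` is hypothetical.

```python
def add_magnitudes(mem, a, b):
    """Add the digit string ending at mem[a] into the string ending at mem[b].

    mem is a list of [f_bit, digit] pairs. Scanning runs toward lower
    addresses. Returns (overflow, a, b) with the final pointer values.
    """
    carry = 0
    a_done = False
    while True:
        # Once A's word mark has been passed, A contributes zero.
        a_digit = 0 if a_done else mem[a][1]
        total = mem[b][1] + a_digit + carry
        mem[b][1] = total % 10
        carry = total // 10

        if not a_done:
            if mem[a][0]:
                a_done = True      # A string has terminated
            a -= 1

        b_ended = bool(mem[b][0])
        b -= 1
        if b_ended:                # B string has terminated
            return carry == 1, a, b
```

With B holding 099 and A holding 45, the B string ends up as 144 and no overflow is reported; the length of the result is fixed by B's word mark, so a carry out of the marked digit is what sets the overflow flag in this model.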

The 1401 ISP (Appendix 1 of this chapter) has four parts: State
Declaration, Instruction-interpretation process, Instruction-execution
process, and Operand address-register calculation process.
The Operand address-register calculation process is analogous
to the Effective-address calculation in more conventional Pc's and
is the most elaborate part of the instruction interpretation. The
operand address registers A_address and B_address are part of the
Pc state and must be retained between instructions. At the end
of an instruction, these registers point to the character of the next
lowest data string in Mp, that is, the character at C[n].

Implementation 

The  1401  has  a  small  Pc  state,  and  there  are  only  a  few  registers 
in  the  implementations.  Figure  5  shows  the  registers,  interregister 
transfer  paths,  and  data  operations  that  make  up  the  register- 


Fig.  4.  IBM  1401  add-instruction-execution  state  diagram. 


[State diagram not reproduced. Legible state labels: initial state, with
operand addresses in the A_Address and B_Address registers pointing to
the A and B strings; char string addition; A string has terminated;
B string has terminated; result string, B, has wrong sign and must be
recomplemented; final state, M[B]{char string} has result data.]




[Data-flow diagram not reproduced. Legible panel titles: 1401 Basic
System; 1401 Tape Multiply-Divide; 1401 Process Overlap. Legible block
labels: Core Storage; Storage Address Reg; Address Modifier; Adder and
Logic; Inhibit Drive; Printer.]


Fig. 5. IBM 1401 system data flow (registers structure). (Courtesy of International Business Machines Corporation.)


transfer level primitives of the complete computer together with
several options. The options, of course, increase the complexity
(and concurrency). Without the overlap feature, for example,
all data are accessed in Mp via Pc's address registers.

There are register pairs consisting of a 3-character memory
address (access) register, and a 1-character data register. The
memory-address, memory-data register pairs are A_address,
A_data; B_address, B_data; I_address, Operation/Op;
Overlap_address, Overlap_data/O.


The implementation is straightforward, and the instruction
times (Table 2) show the implementation at the register-transfer
level. For example, as an instruction is being read by Pc, prior
to instruction execution, each new character is taken in and
examined for the instruction-terminating flag bit. When the flag bit
is present, the instruction is complete and ready to be executed.
The character of the next instruction is not saved but is picked
up again after the previous instruction has been executed.




APPENDIX  1    IBM  1401  ISP  DESCRIPTION 



Pc, Pc Console, and IO Device Control States

The following description is a highly simplified description of the IBM 1401. For example, the edit instruction given below in one
line corresponds to a three page description in the Reference Manual for the 1401. It does not include the input-output instructions
which transfer character strings to fixed blocks of primary memory. The character strings are denoted as character.string/
ch.string/ch.s. For the character.string operations the A_address/A and B_address/B registers contain a pointer to the next A and
B strings at the end of the operations; this aspect of the operation is not described, but implied in the string operations.

I[1:3]<B',A',8,4,2,1>     I_address register, the instruction location pointer

A[1:3]<B',A',8,4,2,1>     A_address register

B[1:3]<B',A',8,4,2,1>     B_address register

String Data pointer registers A and B point to the least significant digit end of a variable length string in memory (see Mp
State definition below). Normally A and B are decreased by one and move to the more significant end for variable length string
{ch.s} operations. B is normally the result string, and the length is defined by a word mark, F, the last character of the B
string. If A string has a word mark, and is shorter than the B string, then the remaining A string is taken to be a zero. I is
a pointer to the most significant digit of the instruction. Although Pc register characters have the B',A',8,4,2,1 bits, the M
has two additional bits, check and field. The bits of Mp are:

Check/Parity_bit. The sum (modulo 2) + 1 of the F,B',A',8,4,2,1 bits.

WM/Word_Mark/F/Field_bit. This bit defines the beginning of each instruction. The F bit also defines the most significant
digit (the last digit) of a variable length numeric integer string.

B',A',8,4,2,1 bits. A 6 bit character is encoded in these bits. If numeric data is represented, the 8,4,2,1 bits are used
as a bcd digit. The sign is encoded with the least significant digit. For numeric data, a minus sign, −, is encoded by
(A' = 0) ∧ (B' = 1). All other combinations of A',B' represent a plus sign, +.
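The check-bit definition can be made concrete with a one-line function. The sketch below reads "the sum (modulo 2) + 1" as (sum of the seven bits + 1) taken modulo 2, which is an interpretation on our part; under that reading every stored 8-bit character carries an odd number of 1s. The name `check_bit` is hypothetical.

```python
def check_bit(f, b_zone, a_zone, d8, d4, d2, d1):
    """Return the Check bit for the seven information bits of a character.

    Assumed reading of the definition: (sum of the bits + 1) mod 2, so the
    total number of 1s, check bit included, is always odd.
    """
    return (f + b_zone + a_zone + d8 + d4 + d2 + d1 + 1) % 2
```

An all-zero character therefore gets a check bit of 1, and setting the word mark on it flips the check bit to 0.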

XR[1:3][1:3]<B',A',8,4,2,1> := M[87:89, 92:94, 97:99]<B',A',8,4,2,1>     three character optional index registers stored in Mp


Indicators[0:63]     logical bit array encoding Pc State (not including I, A, and B)

There are a set of 31 status bits of the possible 64. They can be cleared or set under instruction control. Some Indicators
are used by external Pc status or I/O status. The indicators can be selected for testing by the d character of an instruction.
The Pc indicators assignment to Pc State is:

Unconditional := 1                  always a 1
Sense_switch<A,B,C,D,E,F,G>         a set of 7 console keys
Unequal_Compare                     B ≠ A
Equal_Compare                       B = A
Low_Compare                         B < A
High_Compare                        B > A
Overflow                            set by arithmetic overflow, cleared by a branch instruction if
                                    it is set

The indicator array is partially encoded below:

Indicator[000000] := Unconditional
Indicator[110001] := Sense_switch<A>
Indicator[010001] := Unequal_compare
Indicator[011001] := Overflow


Mp State

M[0:15999]<Check,F,B',A',8,4,2,1>               primary memory

address[X[1:3]<B',A',8,4,2,1>]<1:5>₁₀ := (      Address encoding for 1 of 16000 from a 3 char value of
    X[3]<B',A'> × 4000₁₀ +                      register X. Indexing described below.
    X[1]<B',A'> × 1000₁₀ +
    X[1:3]<8,4,2,1>{bcd.string})

Instruction Format

op<F,B',A',8,4,2,1>          instruction register specifying the operation

d_char<F,B',A',8,4,2,1>      additional character used in some instructions

d_char_present               indicates a d_char is used in the current instruction

active                       indicates an instruction string is still being fetched

A_address_present            indicates there is an A address part of an instruction

B_address_present            indicates there is a B address part of an instruction

Move, load, and store instruction types control the initialization of A and B.

move or load or store A or B/mls := ((move characters and edit = op) ∨ (load characters to A word mark = op) ∨ (move characters
to A or B word mark = op) ∨ (move characters and suppress zeros = op) ∨ (move numerical = op) ∨ (move zone = op) ∨ (store A
address register = op) ∨ (store B address register = op))

Instruction Interpretation Process

Run → (op ← M[I]; I ← I + 1; next          fetch operation
       Fetch_operand_addresses; next        fetch addresses for A and B
       Instruction_execution)               execute

Address Calculation Process

The 1401 calculates explicit effective addresses by first setting up the A and B address registers. Operands are not fetched
in Instruction_execution. There are 1, 2, 4, 5, 7, and 8 character instructions which have the op and the following operands
(respectively): no char; d char; the I or A address; the I or A address and d char; the A and B address; and the I or A address
and B address and d char. The following process defines the operation for correct length instructions.

Fetch_operand_addresses := (
    d_char_present ← 0;
    M[I]<F> → (active ← 0);                                  1 char instruction
    ¬M[I]<F> → (active ← 1; ¬mls → B ← 0); next              proceed to get an I or A address
    active → (d_char ← get_char; next A[1] ← d_char;         I or A address set up or d_char
        d_char_present ← 1; next
        ¬mls → (B[1] ← A[1])); next
    active → (A[2] ← get_char; next ¬mls → B[2] ← A[2]); next
    active → (A[3] ← get_char; next ¬mls → B[3] ← A[3]); next
    active → (A_address_present ← 1);                        record whether I or A address is present
    ¬active → (A_address_present ← 0); next
    A_address_present → (d_char_present ← 0;                 add index register to I or A
        (A[2]<B',A'> ≠ 0) → (A ← A + XR[A[2]<B',A'>] {3.ch}));
    ¬M[I]<F> → (B ← 0); next                                 B address set up or d_char
    active → (d_char ← get_char; next B[1] ← d_char;
        d_char_present ← 1);
    active → (B[2] ← get_char); next
    active → (B[3] ← get_char); next
    active → (B_address_present ← 1);                        record whether B address is present
    ¬active → (B_address_present ← 0); next
    B_address_present → (                                    add index register to B
        d_char_present ← 0;
        (B[2]<B',A'> ≠ 0) → (B ← B + XR[B[2]<B',A'>] {3.ch}));
    (¬M[I]<F> ∧ active) → (d_char ← get_char;                final d_char
        d_char_present ← 1); next
    (¬M[I]<F> ∧ active) → Run ← 0;                           halt if more than 8 char instruction
    )                                                        end Fetch_operand_addresses

get_character:

A sub-process used to fetch each new character in the instruction. If F is found in a character, the process terminates.

get_char<B',A',8,4,2,1> := (
    ¬M[I]<F> ∧ active → (M[I]; I ← I + 1);                   value is present character
    M[I]<F> → active ← 0);                                   no value, terminate


^y^P.  -  character  string  (ch.s) 


Instruction Set and Instruction Execution Process

Instruction_execution := (

character string/ch.s movement and clear memory:

"M" (:= op = 100100) → (M[B] ← M[A] {ch.s});
"Z" (:= op = 011001) → (M[B] ← M[A] {ch.s}; next
        M[B] ← f(M[B]) {ch.s});
"L" (:= op = 100011) → (M[B] ← M[A] {ch.s});                 load characters to A word mark
"E" (:= op = 110101) → (M[B] ← f(M[A], M[B] {ch.s}));        move characters and edit

This instruction moves the A field string to the B field string under control of an edit character string in the original B field.

"/" (:= op = 010001) → (M[B] ← 0 {ch.s.mod.100};             clear storage; ignores the F mark and moves to next
                                                             modulo 100 address
        ¬B_address_present → ;                               clear storage
        B_address_present → I ← A);                          clear storage and branch

character string, {ch.s}, arithmetic:

"A" (:= op = 110001) → (Ov, M[B] ← M[B] + M[A] {ch.s});      add
"S" (:= op = 010010) → (Ov, M[B] ← M[B] − M[A] {ch.s});      subtract
"!" (:= op = 101010) → (M[B] ← 0 − M[A] {ch.s});             zero and subtract
"?" (:= op = 111010) → (M[B] ← 0 + M[A] {ch.s});             zero and add
"@" (:= op = 001100) → (Ov, M[B] ← M[B] × M[A] {ch.s});      multiply; full length product in M[B]; special hardware
"%" (:= op = 011100) → (Ov, M[B] ← M[B] / M[A] {ch.s});      divide; quotient and remainder both end up in M[B]
"#" (:= op = 001011) → (M[B] ← M[B] + M[A] {3.ch};           modify address
        B ← B − 3; A ← A − 3);

branches, halt, no-operation:

"N" (:= op = 100101) → ;                                     no operation
"." (:= op = 111011) → (Run ← 0;                             halt
        ¬A_address_present → ;
        A_address_present → I ← A);                          halt and branch
"B" (:= op = 110010) → (
        (¬B_address_present ∧ ¬d_char_present) → I ← A;      branch
        (¬B_address_present ∧ d_char_present) → (            branch if indicator on
            Indicator[f(d_char)] → (I ← A);
            Indicator[f(d_char)] ← 0);
        (B_address_present ∧ d_char_present) → (             branch if char equal
            B ← B − 1;
            (M[B] = d_char) → I ← A));


"V" (:= op = 010101) → (B ← B − 1;                           branch if word mark and/or zone
        M[B]<f(d_char)> → (I ← A));
"C" (:= op = 110011) → (                                     compare
        Indicators ← M[A] = M[B] {ch.s});
"Q" (:= op = 101000) → (                                     store A address register
        M[A − 2:A] ← A[1:3]; A ← A − 3);
"H" (:= op = 111000) → (                                     store B address register
        M[A − 2:A] ← B[1:3]; A ← A − 3);

single character operations

"," (:= op = 011011) → (M[A]<F> ← 1; M[B]<F> ← 1;            set word mark
        A ← A − 1; B ← B − 1);
"⌑" (:= op = 111100) → (M[A]<F> ← 0; M[B]<F> ← 0;            clear word mark
        A ← A − 1; B ← B − 1);
"D" (:= op = 110100) → (M[B]<8,4,2,1> ← M[A]<8,4,2,1>;       move numerical
        A ← A − 1; B ← B − 1);
"Y" (:= op = 011000) → (M[B]<B',A'> ← M[A]<B',A'>;           move zone
        A ← A − 1; B ← B − 1)
)                                                            end Instruction_execution

Section  4 


Desk  calculator  computers: 
keyboard  programmable  processors 
with  small  memories 

These  stored  program  computers  have  interesting  features. 
For  example,  the  keyboard  is  utilized  several  ways: 

1  T.console mode; a conventional console for entering data
in response to a stored program

2  Program  entry  mode;  a  device  for  creating  stored  pro- 
grams 

3  Desk  calculator  mode;  a  part  of  the  arithmetic  (data) 
element  by  issuing  direct  instructions  and  thus  obtaining 
results  directly  independent  of  a  program 

Uses  2  and  3  are  both  internally  and  externally  programmed. 
The  data  types  are  decimal  (both  fixed  and  floating)  because 
of the intimate interface they require to the user. Some calculators
interpret nested (parenthesized) algebraic expressions.

These  calculators  easily  meet  the  definition  for  a  stored- 
program  computer.  It  is  apparent  their  designers  know  a  great 
deal  about  general  purpose  stored  program  computers.  The 
machines  are  cleverly  designed  and  make  efficient  use  of  the 
hardware  they  possess.  Eventually  there  may  be  more  of  these 
computers  than  conventional  stored  program  computers.  The 
reader  should  note  that  not  all  "electronic  desk  calculators" 
are  computers;  most  are  electronic  versions  of  their  mechanical 
or  electromechanical  ancestors. 


The  OLIVETTI  UNDERWOOD  PROGRAMMA  101  desk  calculator 

The Programma 101 (Chap. 19) is at the limit of what we call
a stored program computer. It has a sufficient instruction set
to be classified as a computer, but the storage for temporary
data, constants, and programs is limited. The machine's
instruction set is interesting because memory is not addressed
explicitly. A jump, for example, is executed by scanning the
program for a particular marker which was named in the jump
instruction. The Programma 101 uses an Mp.cyclic.

The  program  library  for  the  Programma  101  is  extensive  and 
provides  an  indication  of  its  capability. 


The  Hewlett-Packard  Model  9100A  computing  calculator 

The  HP  9100A  (Chap.  20),  like  the  Programma  101  (Chap.  19), 
is  a  desk  calculator.  They  are  both  stored  program  computers. 
Programma  is  designed  for  simpler  accounting  and  statistical- 
tabulation  tasks  and  has  fixed-point  decimal  data.  (Programma 
101 costs somewhat less.) The HP 9100A operates on both fixed-
and  floating-point  decimal  data  with  scalar,  rectangular,  and 
polar  coordinate  vectors  and  is  designed  for  engineering  and 
scientific  calculations.  Thus,  according  to  a  measure  based 
on  data  types  and  operators,  the  HP  9100A  is  about  the  most 
complete  computer  in  the  book.  Its  operations  are  given  in  the 
PMS  diagram  of  Fig.  1. 


Fig. 1. Hewlett-Packard Model 9100A Computing Calculator PMS
diagram.

Mp(read, write; core; 368 w; 6 b/w)
T.console(keyboard)
T.console(CRT; display; numeric; decimal; mixed, floating)
Pc¹(data: (scalar, rectangular co-ordinate vector, polar co-ordinate
    vector); fixed, floating; decimal;
    operations: (+, -, x, /, cos, sin, tan, sin⁻¹, cos⁻¹, tan⁻¹,
    sinh, cosh, tanh, sinh⁻¹, cosh⁻¹, tanh⁻¹, ln, log₁₀, abs, e^x, sqrt,
    integer part,
    [rectangular co-ordinate vector] ← [polar co-ordinate vector],
    [polar co-ordinate vector] ← [rectangular co-ordinate vector]))
T.numeric printer
T.plotter
L.external device
T; M(magnetic card; 2 programs; 196 program steps/program;
    6 b/program step)

¹Pc := (Mp(read only; 512 w; 64 b/w);
    P.microprogrammed²(M.processor state(40 b)))

²P.microprogrammed := (P.microprogrammed;
    Mp(control; read only; 800 ns/w; 64 w; 29 b/w))




The implementation has approximately 36.2 kb of memory,
including the read only and read-write parts. The design is
physically outstanding, and its use of microprogramming is
superb. The reader should note there are two levels of M(read
only). We could draw the PMS structure of Pc as a
P.microprogrammed within a P.microprogrammed. HP rightfully
regards the two ISP's (29-bit and 64-bit word) as proprietary and
carefully avoids discussing these points in the article (Chap. 20).
It might be noted that an IBM System/360 Model 30 requires
about 2.9 milliseconds for a floating-point square root, whereas
the HP 9100A requires 19 milliseconds. By way of evidence of
its outstanding packaging, its cost is about five-eighths that
of a PDP-8/I for about the same amount of physical hardware.
The cost difference, though truly difficult to compare, is partially
the result of a design from an instrument maker (Hewlett-
Packard) versus a design from a computer manufacturer (DEC).
The TV-like construction of the HP 9100A is an important
lesson that computer manufacturers have not learned. In other
words, a Henry Ford has yet to emerge from the computer field.
(Our guess is that he may come from Japan.)

Whereas many computers in this book are included because
they are typical of points in the computer space, the HP 9100A
is included because it is innovative. It is worthy of note that
only one of the engineers had some computer design
experience; Cochran, who did the programming, had prior experience
with circuitry and instrumentation. Had he been a programmer
by training, a larger Mp might have been required. By way of
comparative evidence, the IBM 1800 floating-point arithmetic
functions +, -, x, /, sin, cos, tan⁻¹, √, log, exponential,
tanh, binary to decimal, and decimal to binary require
approximately 1,425 16-bit words, or 23 kb. On the other hand, the
FOCAL¹ interactive calculator program for a 4,096-word PDP-8
(49 kb) provides the user with all but polar-rectangular
coordinates and hyperbolic functions, but it does have a complete
program editing capability, text handling, control structure, and
1,600-character Mp.
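
The two memory sizes quoted above follow from the word counts and word lengths. The sketch below redoes that arithmetic; the 12-bit PDP-8 word length is supplied here as an assumption (the passage gives only the word count), and the function name is ours.

```python
def kilobits(words: int, bits_per_word: int) -> float:
    """Storage size in thousands of bits (1 kb taken as 1000 bits)."""
    return words * bits_per_word / 1000

# IBM 1800 floating-point function package: 1,425 words of 16 bits.
ibm_1800_kb = kilobits(1425, 16)

# FOCAL configuration: 4,096-word PDP-8, assuming 12-bit words.
pdp8_focal_kb = kilobits(4096, 12)

print(f"IBM 1800 functions: about {round(ibm_1800_kb)} kb")
print(f"PDP-8 with FOCAL:   about {round(pdp8_focal_kb)} kb")
```

Both results round to the figures given in the text (23 kb and 49 kb).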


¹Similar in scope to Dartmouth's BASIC.


Chapter  19 


The OLIVETTI Programma 101 desk
calculator¹


The Programma 101 is manufactured by the Olivetti Underwood
Corporation. The cost of Programma 101 is about $3,500 (in 1968).
Several thousand are currently in use. Unlike conventional
stored program computers it has instructions which can be
executed directly as commands from a keyboard or instructions which
can be stored in a program and interpreted by the processor. The
processor uses the decimal representation for mixed numbers. The
decimal point location is controlled manually. Although
information is stored in character strings, the maximum length is 22 digits
or 24 instructions for a register. A program can be up to 120
characters long and is stored as a continuous string. The internal
encoding of a character is 8 bits. There are no absolute addresses
for instructions, and jump instructions are programmed by placing
labels or references in the string to transfer to. The Programma 101
is composed of the following elements.

Memory.  The  memory  stores  numeric  data  and  program  instruc- 
tions. 

Keyboard. The keyboard has four functions: It is used for operator
control of the calculator (power on, off, etc.); in manual mode the
instructions are executed immediately as in a conventional desk
calculator (e.g., add); the keys write a program's instructions in
the memory, and the instructions are executed when the program
is run; and numeric data may be entered to a running program.

Printing  unit.  Serial  printing  is  from  right  to  left,  at  30  characters 
per  second;  this  unit  prints  all  keyboard  entries,  programmed 
output,  and  instructions. 

Magnetic-card  reader  recorder.  This  device  permits  instructions 
and  constants  for  a  program  to  be  stored  and  retrieved  from 
magnetic  cards. 

Control  and  arithmetic  units.  The  control  unit  is  the  administrative 
section  of  the  computer.  It  receives  the  incoming  information, 
determines  the  computation  to  be  performed,  and  directs  the 

¹The description is partially taken from the Programma 101 Programming
Manual.


arithmetic  unit  where  to  find  the  information  and  what  operation 
to  perform. 

The PMS diagram shown below is, of course, very simple. It
conforms closely to the classic diagram of what a digital computer
looks like:

Mp --- Pc ---+--- T --- M.magnetic_card
             +--- T.printer →
             +--- T.keyboard ←


Primary  memory  and  processor  memory 

The memory has 10 registers; eight are for general storage and
two are used exclusively for instructions. A character can have
several meanings, depending on the register and its use.

The two instruction registers, 1 and 2, each store 24
instructions. An instruction is one character long.

The eight storage registers, M, A, R, B, C, D, E, and F, have
a capacity of 22 decimal digits, plus decimal point and sign. The
sign and decimal point do not require character space.
Alternatively, D, E, and F hold 24 instructions. M, A, and R are operating
registers and take part in all arithmetic operations. They are
considered to be the arithmetic unit.

The  M  register  is  the  Median  (or  distributive)  register.  All 
keyboard  figure  entries  are  held  in  the  M  register  and  distributed 
to  the  other  registers  as  instructed. 

The A register functions with the arithmetic unit to form the
Accumulator. Arithmetic results are developed and retained in the
A register. A result of up to 23 digits can be produced in the A
register.

The  R  register  retains  the  complete  results  in  addition  and 
subtraction,  the  complete  product  in  multiplication,  the  remainder 
in division, and a remainder in square root. B, C, D, E, and F
are  storage  registers.  Each  can  be  split  into  two  registers,  each 
with  a  capacity  of  11  digits,  plus  decimal  point  and  sign.  When 
storage  registers  are  split,  the  right  portion  of  the  split  register 
retains  its  original  designation,  and  the  left  side  is  identified  with 
the  corresponding  lowercase  letter.  Thus  these  registers  become 




b, B, c, C, d, D, e, E, f, and F. The lowercase designation is
obtained by first entering the corresponding uppercase letter and
then depressing the "/" key, for example, c = C/.

The registers D, E, and F or their splits have the additional
capability of storing either instructions or constants to be used
within programs. Thus they can store 1 signed 22-digit number,
2 signed 11-digit numbers, 1 signed 11-digit number and 11
instructions, or 24 instructions. Programs of up to 120 instructions
can be stored internally (Fig. 1). When registers D, E, and F and
their splits are not used for instructions, they are free to store
constants or intermediate results.
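
The 120-instruction limit follows from the register capacities given above: the two instruction registers plus D, E, and F, at 24 instructions each. The following is a minimal sketch of that bookkeeping; the function and constant names are illustrative and do not come from the Programma 101 manual.

```python
INSTRUCTIONS_PER_REGISTER = 24            # capacity stated in the text

INSTRUCTION_REGISTERS = ["1", "2"]        # hold instructions only
DUAL_USE_REGISTERS = ["D", "E", "F"]      # hold data or instructions

def max_program_length(dual_use_given_to_program: int) -> int:
    """Program capacity when some of D, E, F are given over to instructions."""
    if not 0 <= dual_use_given_to_program <= len(DUAL_USE_REGISTERS):
        raise ValueError("only D, E, and F can hold instructions")
    registers = len(INSTRUCTION_REGISTERS) + dual_use_given_to_program
    return registers * INSTRUCTIONS_PER_REGISTER

def split(name: str) -> tuple[str, str]:
    """Names of the (left, right) halves of a split storage register."""
    if len(name) != 1 or name not in "BCDEF":
        raise ValueError("only B, C, D, E, and F can be split")
    return name.lower(), name   # left half takes the lowercase letter

print(max_program_length(3))   # all of D, E, F used for instructions
print(split("C"))
```

Each of D, E, or F kept for data shortens the longest possible program by 24 instructions, which is the trade-off the paragraph describes.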

The relationship of memory, keyboard, printer, and magnetic
card is shown in Fig. 1. Registers are referenced explicitly.
Programs do not use explicit addresses in instructions. Thus, special
marker characters are placed in the instructions to serve as jump
reference addresses (program labels).


Fig.  2.  Programma  101.  (Courtesy  of  Olivetti  Underwood  Corporation.) 


Fig. 1. Programma 101 functional block diagram. (Courtesy of
Olivetti Underwood Corporation.)


Structure 

The calculator parts are described briefly below. The parts
correspond to both the numbers (Fig. 2) and the lettered keyboard (Fig.
3). The following parts are, in effect, the console. Some of the keys
are used for control of the calculator, and some can be used either
as programmed instructions or as commands which are executed
directly. The following section discusses their instruction function.

The  on-off  key  (1).  This  is  a  dual-purpose  switch  for  both  the 
on  and  off  positions.  (Note:  The  OFF  position  automatically  clears 
all  stored  data  and  instructions.) 

The error (red) light (2). This lights when the computer is turned
on  and  whenever  the  computer  detects  an  operational  error,  e.g., 
exceeding  capacity,  division  by  zero. 

The general reset key (3). This key erases all data and
instructions from the computer and turns off the error light.

The correct-performance (green) light (4). This light indicates
the computer is functioning properly. A steady light indicates that
the computer is ready for an operator decision; a flickering light
indicates that the computer is executing programmed instructions
and that the keyboard is locked.

The decimal wheel (5). This determines the number of decimal
places (0, 1, ..., 15) to which computations will be carried out
in the A register and the decimal places in the printed output,
except for results from the R register. Up to 22 decimal digits may
be developed in, and printed from, the R register.




Fig.  3.  Programma  101  keyboard.  (Courtesy  of  Olivetti  Underwood 
Corporation.) 

The  record  program  switch  (6).  When  this  switch  is  off,  the 
commands  pressed  on  the  keyboard  are  executed  directly.  When 
this  switch  is  on,  it  directs  the  computer  to  store  instructions  either 
in  the  memory  from  the  keyboard  or  onto  a  magnetic  program 
card  from  the  memory. 

The  record  program  switch  must  be  off  to  load  instructions  from 
a  magnetic  program  card  into  the  memory. 

The print program switch (7). When this switch is on (in), it
directs the computer to print out the instructions stored in memory
from its present location in the program to the next Stop
instruction (S), whenever the print key (20) is depressed.

The magnetic program card (8). This is a plastic card with a
ferrous oxide backing, used to record programs for external storage.
The card is inserted into a magnetic reader/writer (9) to record
instructions and/or constants into or from the computer memory.
Once inserted, the card may be removed from the computer (10)
without disturbing the stored instructions.

(Note: The magnetic-card reader/writer uses only half the
magnetic card at a time; consequently, two sets of 120 instructions
and/or constants may be stored on a single card.)

The  keyboard  release  key  (11).  This  key  reactivates  a  locked 
keyboard.  If  two  or  more  keys  are  depressed  simultaneously,  the 
keyboard will lock to indicate a misoperation. Because the
operator does not know what entry was accepted by the computer, after
touching  the  keyboard  release  key,  the  clear  entry  key  (16)  must 
be  depressed  and  the  complete  figure  reentered. 

Tape advance (12). This advances the printing paper tape.

Tape  release  lever  (13).  This  enables  adjustment  when  changing 
tape  rolls. 

The routine selection (keys V, W, Y, and Z). These keys direct
the computer to the proper program or subroutine.

The numeric keyboard (keys 0, 1, ..., 9, ., -). This keyboard
allows  entry  of  a  signed,  mixed  decimal  number.  Keyboard  entries 
are  automatically  stored  in  the  M  register. 

The  clear  entry  key.  This  key  clears  the  entire  keyboard  entry. 
When  keying  in  the  program,  a  depression  of  the  clear  key  will 
erase  the  last  instruction  that  has  been  entered  into  the  memory. 
The  printing  tape  will  be  spaced. 

The  start  key  (S).  This  key  restarts  the  computer  in  programmed 
operation;  it  is  used  to  code  a  stop  instruction  when  keying  in 
programs. 

The register address (keys A, B, C, D, E, F, and R). These keys
identify the corresponding registers. The operating register M has
no keyboard identification since the computer automatically
relates all instructions to the M register unless otherwise instructed.

The split key (/). This key combined with a register (for
example, C/) divides that register into two equal parts. When storage
registers are split, the right portion of the split register retains
the original designation, and the left side is identified on the tape
with the corresponding lowercase letter (for example, C/ = c).

The print key (◊). This key prints the contents of an addressed
register.

The  clear  key  (°).  This  key  clears  the  contents  of  an  addressed 
register.  When  the  computer  is  operated  manually,  a  depression 
of  this  key  will  print  the  number  in  the  register  and  clear  it. 

The transfer keys (↓, ↑, ↕). These keys perform transfer
operations between the storage registers and the operating registers.

The arithmetic keys (-, +, x, ÷, √). These keys perform
their indicated arithmetic function.

Keyboard  and  stored-program  operations 

All the following keys can be used as direct instructions (i.e.,
manually) if the record program switch is off. Alternatively, if the
record program switch is on, the keys specify the instruction to
be recorded in the program memory. Finally, the descriptions
specify the instruction's behavior as it is executed within a
program.

Start  S.  The  instruction  S  (used  in  creating  a  program)  directs 
the  computer  to  stop  and  release  the  keyboard  for  the  entry  of 
figures  or  the  selection  of  a  subroutine.  After  figure  entry,  the 
program  is  restarted  by  touching  the  start  key  (S). 

The program can also be restarted by touching a routine
selection key. When the S instruction stops the program, the computer
may  also  be  operated  in  the  manual  mode  without  disturbing  the 
program  instructions  in  the  memory.  Any  figures  entered  on  the 
keyboard  before  depression  of  start  or  an  operation  key  will  be 
printed  automatically. 

Clear  °.  The  clear  operation  °  directs  the  computer  to  clear 
the  selected  register.  The  M  and  R  registers  cannot  be  cleared 
with  this  instruction. 

When  the  computer  is  operated  manually  this  key  will  cause 
it to print the contents of the selected register, r. (r ← 0)

Data-transfer  operations 

To A ↓. An instruction containing the operation ↓ directs the
computer to transfer contents of the addressed register, r, to A
while retaining them in the original register. The contents of M
and R are not affected. The previous contents of A are destroyed.
(A ← r)

From M ↑. An instruction containing the operation ↑ directs
the computer to transfer the contents of M to the addressed
register while retaining them in M. The contents of registers A and
R are unaffected by this instruction. The original contents of the
addressed register are destroyed. (r ← M)

Exchange ↕. An instruction containing the operation ↕ directs
the computer to exchange the contents of the A register with the
contents of the addressed register. The contents of M are not
affected except by the exchange between A and M. The contents
of the R register are not affected. (A ← r; r ← A)

D-R exchange RS. The instruction RS directs the computer to
exchange the contents of D (both D and d registers) with the
contents of the R register. (D ← R; R ← D)

This instruction has a special use in multicard programs to store
temporarily the contents of the D (d,D) register in R, when a new
card has to be read to continue the program. During this
temporary storage no instruction affecting the R register should be
executed.

Decimal part to M /↕. The instruction /↕ directs the computer
to transfer the decimal portion of the contents of A to the M
register while retaining the entire contents in A. The original
contents of the M register are destroyed. The R register is not
affected by this instruction. (M ← fraction_part(A))

Arithmetic  operations 

All  arithmetic  operations  are  performed  in  the  operating  registers 
M,  A,  and  R.  An  arithmetic  operation  is  performed  in  two  phases: 

1  The  contents  of  the  selected  register  are  automatically 
transferred  to  the  M  register.  The  M  register  is  selected 
automatically  if  no  other  register  is  indicated. 

2  The  operation  is  carried  out  in  the  M,  A,  and  R  registers. 

Programma 101 can perform these arithmetic operations: +,
-, x, ÷, √, and absolute value. Figures are accepted and
computed algebraically. A negative value is entered by depressing
the negative key at any time during the entry of a figure. If there
is no negative indication, the computer will accept the figure as
positive.

The subtract operation key is separate from the numeric
keyboard and is used exclusively for subtraction (not negation).

Addition +. An instruction containing the operation + directs
the computer to add the contents of the selected register (addend)
to the contents of the A register (augend). Addition is executed
in two phases:

1  Transfer the contents of the selected register (addend)
to M.

2  Add the contents of M to the contents of A (augend),
obtaining in A the sum truncated according to the setting of
the decimal wheel. The complete sum is in R. M contains
the addend. (M ← r; next R ← A + M; next A ←
f(R, decimal_wheel))

Multiplication x. An instruction containing the operation x
directs the computer to multiply the contents of the selected
register (multiplicand) by the contents of the A register
(multiplier).

1  Transfer the contents of the addressed register to M.

2  Multiply the contents of M by the contents of A, obtaining
in A the product truncated according to the setting of the
decimal wheel. The complete product is in R. M contains
the multiplicand. (M ← r; next R ← A x M; next A ←
f(R, decimal_wheel))




Subtraction -. An instruction containing the operation -
directs the computer to subtract the contents of the selected
register (subtrahend) from the contents of the A register (minuend).

1  Transfer the contents of the selected register (subtrahend)
to M.

2  Subtract the contents of M from the contents of A
(minuend), obtaining in A the difference truncated according to
the setting of the decimal wheel. The complete difference is
in R. M contains the subtrahend. (M ← r; next R ← A - M;
next A ← f(R, decimal_wheel))

Division ÷. An instruction containing the operation ÷ directs
the computer to divide the contents of the selected register
(divisor) into the contents of the A register (dividend).

1  Transfer the contents of the addressed register to M.

2  Divide the contents of M into the contents of A, obtaining
in A the quotient truncated according to the setting of the
decimal wheel. The decimally correct fractional remainder
is in R. M contains the divisor. (M ← r; next A ← A ÷ M;
R ← A mod M)

Square Root √. An instruction containing the operation √
directs the computer to:

1  Transfer the contents of the selected register to M.

2  Extract the square root of the contents of M, as an absolute
value, obtaining in A the result truncated according to the
setting of the decimal wheel. The R register contains
a nonfunctional remainder. At the end of the operation,
M contains double the square root. (M ← r; next
M,R ← sqrt(abs(M)) x 2; next A ← f(M/2, decimal_wheel))

Absolute Value A↕. The absolute-value instruction A↕ changes
the contents of the A register, if negative, to positive. (A ← abs(A))
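
The two-phase pattern shared by the arithmetic instructions (operand to M, then result into R and a truncated copy into A) can be sketched as follows. This is an illustration under stated assumptions, not the machine's algorithm: truncate() stands in for the function f(R, decimal_wheel) of the text, the division remainder is taken as dividend minus truncated quotient times divisor, the "nonfunctional remainder" left in R by square root is not modelled, and all names are ours.

```python
from decimal import Decimal, ROUND_DOWN

def truncate(value, decimal_wheel):
    """Keep decimal_wheel decimal places, discarding the rest (no rounding)."""
    return value.quantize(Decimal(1).scaleb(-decimal_wheel), rounding=ROUND_DOWN)

def add(regs, operand, wheel):
    regs["M"] = operand                         # phase 1: M <- r
    regs["R"] = regs["A"] + regs["M"]           # phase 2: complete sum in R
    regs["A"] = truncate(regs["R"], wheel)      # truncated sum in A

def subtract(regs, operand, wheel):
    regs["M"] = operand
    regs["R"] = regs["A"] - regs["M"]           # complete difference in R
    regs["A"] = truncate(regs["R"], wheel)

def multiply(regs, operand, wheel):
    regs["M"] = operand
    regs["R"] = regs["A"] * regs["M"]           # complete product in R
    regs["A"] = truncate(regs["R"], wheel)

def divide(regs, operand, wheel):
    regs["M"] = operand
    quotient = truncate(regs["A"] / regs["M"], wheel)
    regs["R"] = regs["A"] - quotient * regs["M"]  # assumed form of the remainder
    regs["A"] = quotient

def square_root(regs, operand, wheel):
    regs["M"] = operand
    root = abs(regs["M"]).sqrt()
    regs["M"] = 2 * root                        # M ends up holding double the root
    regs["A"] = truncate(root, wheel)

regs = {"M": Decimal(0), "A": Decimal("1.005"), "R": Decimal(0)}
add(regs, Decimal("2.004"), wheel=2)
sum_in_a, sum_in_r = regs["A"], regs["R"]       # truncated and complete sums
```

With the decimal wheel at 2, the complete sum 3.009 stays in R while A receives it cut to two places, which is the distinction between A and R that the descriptions above rely on.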

Jump  operations 

The jump operation directs the computer to depart from the
normal sequence of step-by-step instructions and jump to a
preselected point in the program.

These instructions provide both internal and external (manual)
decision capability and are useful to create "loops" that allow
repetitive sequences in a program to be executed; routines or
subroutines to be performed at the discretion of the operator;
and automatically to "branch" to alternate routines or subroutines
according to the value in the A register.


The jump process consists of two related instructions or
characters:

1  The reference point or label, L, is where the program begins
or where the jump is to start. The sequence is restarted at
this point. This label has no effect when interpreted.

2  The  jump  instruction  specifies  the  label  for  the  instruction 
sequence. 

There are two types of jump instructions: unconditional jumps
and conditional jumps.

Unconditional jumps. These jumps are executed whenever the
instruction is read. The labels or reference points for unconditional
jumps, L, and the corresponding jump instructions, j, are given
as (L,j). The permissible jump labels and jump instructions are:

(AV,V), (AW,W), (AY,Y), (AZ,Z), (BV,CV), ...,
(BZ,CZ), (EV,DV), ..., (EZ,DZ), (FV,RV), ..., (FZ,RZ)

All programs must begin with reference points of an
unconditional jump instruction. Reference points AV, AW, AY, AZ are
used so that these program sequences can be started by touching
the routine selection keys V, W, Y, or Z.

Conditional jumps. If the contents of the A register are:

Greater than zero: the program jumps to the corresponding
reference point (label).

Zero or less: the program continues with the next
instruction in sequence.

The  labels  or  reference  points  for  conditional  jumps,  L,  and 
the  corresponding  conditional  jump  instruction,  cj,  are  given  as 
(L,cj).  The  permissible  jump  labels  and  jump  instructions  are 

(aV,/V), ..., (aZ,/Z), (bV,cV), ...,
(bZ,cZ), (eV,dV), ..., (eZ,dZ), (fV,rV),
..., (fZ,rZ)
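
Because a jump names a label rather than an address, executing one amounts to a table lookup followed by a scan of the program string. The sketch below assumes the (label, jump) pairings listed above and represents a program as a Python list of instruction strings; the helper names and that representation are ours, not the machine's.

```python
# Prefix of a jump instruction -> prefix of the label it refers to.
# An empty prefix means the jump is a bare routine letter such as V.
UNCONDITIONAL = {"": "A", "C": "B", "D": "E", "R": "F"}
CONDITIONAL = {"/": "a", "c": "b", "d": "e", "r": "f"}
ROUTINES = "VWYZ"

def label_for(jump: str) -> tuple[str, bool]:
    """Return (label, is_conditional) for a jump such as 'CV' or '/W'."""
    prefix, routine = jump[:-1], jump[-1:]
    if len(routine) != 1 or routine not in ROUTINES:
        raise ValueError(f"not a jump instruction: {jump!r}")
    if prefix in UNCONDITIONAL:
        return UNCONDITIONAL[prefix] + routine, False
    if prefix in CONDITIONAL:
        return CONDITIONAL[prefix] + routine, True
    raise ValueError(f"not a jump instruction: {jump!r}")

def next_position(program: list[str], pc: int, a_register) -> int:
    """Index of the instruction to run after the jump at program[pc]."""
    label, conditional = label_for(program[pc])
    if conditional and not a_register > 0:
        return pc + 1                  # A is zero or less: continue in sequence
    return program.index(label)        # scan the program for the label

program = ["AV", "S", "/V", "S", "aV", "V"]
taken = next_position(program, 2, a_register=5)       # A > 0: go to label aV
not_taken = next_position(program, 2, a_register=0)   # A <= 0: fall through
```

Since the label itself has no effect when interpreted, resuming at the label's own position is equivalent to resuming at the instruction that follows it.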

Constants as instructions A/↑. A one-digit constant can be
generated by a special instruction. The results of the instruction place
the digit in M. The digit value of the constant must follow A/↑.

Instructions and data in the same register. An instruction can be
considered to be data and, therefore, used as both a constant and
an instruction. Another technique allows the computer to interpret
data as null instructions so that both data (for reading and writing)
and instructions can be stored in the same register.

Examples. A program to take values for the numbers A, B, C, and
D from the keyboard and then print the value of the expression
[(A + B) x C]/D would be written as follows:

instruction   comments

AV            label to allow the program to be started by key V
S             wait; enter A from keyboard into M
↓ or ↓M¹      A value goes to A register
S             wait; enter B from keyboard
+ M           A register contains A + B
S             wait; enter C from keyboard
x M           A register x C, or (A + B) x C
S             wait; enter D from keyboard
÷ M           A register has expression
A◊            print A register
V             jump back to beginning label to recalculate
              expression for new variables

¹M is implied if left blank.

The following program computes and prints n!. n is entered
from the keyboard, where n > 1, and an integer. The program is
started by pressing key Z.

instruction   comments

AZ            program start, label
S             stop; enter n from keyboard into M
D↑            D ← n; D holds n! or n x (n - 1) x ...
M↓ (or ↓)     A ← n; A holds n, n - 1, n - 2, ..., 1
AW            label
A/↑           generate 1 in M
1
M- (or -)     A ← A - 1; (n ← n - 1)
/V            test if n > 0
D◊            print result
Z             get next n from keyboard
aV            begin to update n!, label
D↕            A holds n!; D holds n - 1 after execution
Dx            A holds n x (n - 1) x ...
D↕            D holds n!; A holds n - 1 after execution
W             return to compute n - 2
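
The control flow of the factorial listing can be checked by following it one instruction at a time. The sketch below is our reading of the listing, with registers D and A held as plain Python integers and the two exchange steps written as tuple swaps; it is not a simulator of the machine, and the function name is hypothetical.

```python
def factorial_program(n: int) -> int:
    """Trace the listing: D accumulates the product, A counts down."""
    d = n                 # D <- n (from M)
    a = n                 # A <- n (to A)
    while True:           # label AW
        a = a - 1         # generate 1 in M, then A <- A - 1
        if not a > 0:     # the conditional jump is taken only when A > 0
            return d      # print D (the result), then wait for the next n
        # label aV: update the running product
        a, d = d, a       # exchange: A holds the product, D holds the counter
        a = a * d         # A <- A x D
        a, d = d, a       # exchange: D holds the product, A holds the counter
        # unconditional jump back to label AW

for n in (1, 2, 5):
    print(n, factorial_program(n))
```

The pair of exchanges around the multiplication is needed because products are always formed in A, while the loop test also inspects A; D therefore serves alternately as counter and as accumulator.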


Conclusion 

Many  algorithms  have  been  written  for  Programma  101,  being 
coded  in  impressively  small  space.  The  techniques  have  sometimes 
been  borrowed  from  conventional  computer  programming.  For 
example,  multiple  card  programs  operate  by  using  chains  in  the 
same  way  as  large  FORTRAN  programs.  The  significant  fact  to 
the reader is that the Programma 101 calculator is a nicely
designed stored program computer.


Chapter  20 


The HP Model 9100A computing
calculator¹

Richard  E.  Monnier  /  Thomas  E.  Osborne  / 
David  S.  Cochran 


A  new  electronic  calculator  with  computerlike  capabilities 

Many  of  the  day-to-day  computing  problems  faced  by  scientists 
and  engineers  require  complex  calculations  but  involve  only  a 
moderate  amount  of  data.  Therefore,  a  machine  that  is  more  than 
a  calculator  in  capability  but  less  than  a  computer  in  cost  has  a 
great  deal  to  offer.  At  the  same  time  it  must  be  easy  to  operate 
and  program  so  that  a  minimum  amount  of  effort  is  required  in 
the  solution  of  typical  problems.  Reasonable  speed  is  necessary 
so  that  the  response  to  individual  operations  seems  nearly  instan- 
taneous. 

The  HP  Model  9100A  Calculator,  Fig.  1,  was  developed  to  fill 
this  gap  between  desk  calculators  and  computers.  Easy  interaction 
between  the  machine  and  user  was  one  of  the  most  important 
design  considerations  during  its  development  and  was  the  prime 
guide  in  making  many  design  decisions. 

CRT  display 

One  of  the  first  and  most  basic  problems  to  be  resolved  concerned 
the  type  of  output  to  be  used.  Most  people  want  a  printed  record, 
but  printers  are  generally  slow  and  noisy.  Whatever  method  is 
used,  if  only  one  register  is  displayed,  it  is  difficult  to  follow  what 
is  happening  during  a  sequence  of  calculations  where  numbers  are 
moved  from  one  register  to  another.  It  was  therefore  decided  that 
a  cathode-ray  tube  displaying  the  contents  of  three  registers  would 
provide the greatest flexibility and would allow the user to follow
problem  solutions  easily.  The  ideal  situation  is  to  have  both  a  CRT 
showing  more  than  one  register,  and  a  printer  which  can  be  at- 
tached as  an  accessory. 

Figure  2  is  a  typical  display  showing  three  numbers.  The  X 
register displays numbers as they are entered from the keyboard
one  digit  at  a  time  and  is  called  the  keyboard  register.  The  Y 
register  is  called  the  accumulator  since  the  results  of  arithmetic 

¹This chapter is a compilation of three articles [Monnier, 1968; Osborne,
1968; Cochran, 1968], reprinted from Hewlett-Packard Journal, vol. 20,
no. 1, pp. 3-9, 10-13, 14-16, September, 1968.


operations on two numbers, one in X and one in Y, appear in the
Y  register.  The  Z  register  is  a  particularly  convenient  register  to 
use  for  temporary  storage. 

Numbers 

One of the most important features of the Model 9100A is the
tremendous range of numbers it can handle without special attention
by the operator. It is not necessary to worry about where
to place the decimal point to obtain the desired accuracy or to
avoid register overflow. This flexibility is obtained because all
numbers are stored in 'floating point' and all operations are performed
using 'floating point arithmetic.' A floating point number is expressed
with the decimal point following the first digit and an
exponent representing the number of places the decimal point
should be moved: to the right if the exponent is positive, or to
the left if the exponent is negative.


Fig. 1. This new HP Model 9100A calculator is self-contained and is
capable of performing functions previously possible only with larger
computers.




Fig. 2. Display in fixed point with the decimal wheel set at 5. The Y
register has reverted to floating point because the number is too large
to be properly displayed unless the digits called for by the DECIMAL
DIGITS setting are reduced.

4.398 364 291 × 10⁻³ = .004 398 364 291
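The notation can be illustrated with a short Python sketch that splits a decimal number into a mantissa with one digit before the decimal point and a power-of-ten exponent. The function name is hypothetical, and Python's `decimal` module stands in for the calculator's own number format, which it does not reproduce.

```python
from decimal import Decimal


def to_floating_point(text: str) -> tuple[Decimal, int]:
    """Return (mantissa, exponent) with the point after the first digit."""
    value = Decimal(text)
    if value.is_zero():
        return Decimal(0), 0
    exponent = value.adjusted()          # position of the leading digit
    mantissa = value.scaleb(-exponent)   # shift the point next to that digit
    return mantissa, exponent


mantissa, exponent = to_floating_point("0.004398364291")
print(mantissa, exponent)
```

A negative exponent moves the point to the left and a positive one to the right, as described above.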

The  operator  may  choose  to  display  numbers  in  FLOATING 
POINT  or  in  FIXED  POINT.  The  FLOATING  POINT  mode 
allows numbers, either positive or negative, from 1 × 10⁻⁹⁹ to
9.999 999 999 × 10⁹⁹ to be displayed just as they are stored in the
machine. 

The  FIXED  POINT  mode  displays  numbers  in  the  way  they 
are  most  commonly  written.  The  DECIMAL  DIGITS  wheel  allows 
setting  the  number  of  digits  displayed  to  the  right  of  the  decimal 
point  anywhere  from  0  to  9.  Figure  2  shows  a  display  of  three 
numbers with the DECIMAL DIGITS wheel set at 5. The number
in the Y register, 5.336 845 815 × 10⁵ = 533 684.5815, is too big
to be displayed in FIXED POINT without reducing the DECIMAL
DIGITS setting to 4 or less. If the number is too big for
the  DECIMAL  DIGITS  setting,  the  register  involved  reverts 
automatically  to  floating  point  to  avoid  an  apparent  overflow.  In 
FIXED  POINT  display,  the  number  displayed  is  rounded,  but  full 
significance  is  retained  in  storage  for  calculations. 
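The reversion rule can be sketched as follows. The sketch assumes a ten-digit display, which is consistent with the example (six integer digits fit with a setting of 4 but not 5) but is an inference, and it assumes round-half-up for the displayed value. The function name and output format are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

DISPLAY_DIGITS = 10  # assumption: inferred from the Fig. 2 example


def show(text: str, decimal_digits: int) -> str:
    """Format a number as a fixed point display would, or revert to floating."""
    value = Decimal(text)
    integer_digits = max(value.adjusted() + 1, 1)
    if integer_digits + decimal_digits > DISPLAY_DIGITS:
        # Too big for this DECIMAL DIGITS setting: revert to floating point.
        exponent = value.adjusted()
        return f"{value.scaleb(-exponent)} x 10^{exponent}"
    step = Decimal(1).scaleb(-decimal_digits)
    # Only the displayed copy is rounded; the stored value is untouched.
    return str(value.quantize(step, rounding=ROUND_HALF_UP))


print(show("533684.5815", 5))
print(show("533684.5815", 4))
```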

To improve readability, 0's before the displayed number and
un-entered 0's following the number are blanked. In FLOATING
POINT,  digits  to  the  right  of  the  decimal  are  grouped  in  threes. 

Pull-out  instruction  card 

A pull-out instruction card, Fig. 3, is located at the front of the
calculator under the keyboard. The operation of each key is briefly


explained  and  key  codes  are  listed.  Some  simple  examples  are 
provided  to  assist  those  using  the  machine  for  the  first  time  or 
to refresh the memory of an infrequent user. Most questions
regarding the operation of the Model 9100A are answered on the
card.

Data  entry 

The calculator keyboard is shown in Fig. 4. Numbers can be
entered into the X register using the digit keys, the π key or the
ENTER EXP key. The ENTER EXP key allows powers of 10 to
be entered directly, which is useful for very large or very small
numbers. 6.02 × 10²³ is entered 6 . 0 2 ENTER EXP 2 3. If the
ENTER EXP key is the first key of a number entry, a 1 is
automatically entered into the mantissa. Thus only two keystrokes,
ENTER EXP 6, suffice to enter 1,000,000. The CHG SIGN key changes
the sign of either the mantissa or the exponent depending upon
which one is presently being addressed. Numbers are entered in
the same way, regardless of whether the machine is in FIXED
POINT or FLOATING POINT. Any key, other than a digit key,
decimal point, CHG SIGN or ENTER EXP, terminates an entry;
it is not necessary to clear before entering a new number. CLEAR
X sets the X register to 0 and can be used when a mistake has
been made in a number entry.


Fig. 3. Pull-out instruction card is permanently attached to the calculator
and contains key codes and operating instructions.


Fig. 4. Keys are in four groups on the keyboard, according to their
function.
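The entry rules above amount to a small state machine, sketched here in Python. The key labels are taken from the text; the function name, the use of `Decimal`, and the choice to raise an error on a terminating key are illustrative assumptions only.

```python
from decimal import Decimal

DIGIT_KEYS = set("0123456789")


def enter_number(keys: list[str]) -> Decimal:
    """Build the value in X from a sequence of entry keystrokes."""
    mantissa, exponent = "", ""
    mantissa_negative = exponent_negative = False
    in_exponent = False
    for key in keys:
        if key == "ENTER EXP":
            if mantissa == "":
                mantissa = "1"  # ENTER EXP as first key implies a mantissa of 1
            in_exponent = True
        elif key == "CHG SIGN":
            # The sign changed is that of whichever part is being addressed.
            if in_exponent:
                exponent_negative = not exponent_negative
            else:
                mantissa_negative = not mantissa_negative
        elif key == ".":
            if not in_exponent and "." not in mantissa:
                mantissa += "."
        elif key in DIGIT_KEYS:
            if in_exponent:
                exponent += key
            else:
                mantissa += key
        else:
            raise ValueError(f"{key!r} would terminate the entry")
    value = Decimal(mantissa if mantissa not in ("", ".") else "0")
    power = -int(exponent or "0") if exponent_negative else int(exponent or "0")
    value = value.scaleb(power)
    return -value if mantissa_negative else value


print(enter_number(["6", ".", "0", "2", "ENTER EXP", "2", "3"]))
```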

Control  and  arithmetic  keys 

ADD, SUBTRACT, MULTIPLY, DIVIDE involve two numbers,
so the first number must be moved from X to Y before the second
is entered into X. After the two numbers have been entered, the
appropriate operation can be performed. In the case of a DIVIDE,
the dividend is entered into Y and the divisor into X. Then the
÷ key is pressed, causing the quotient to appear in Y, leaving
the divisor in X.

One way to transfer a number from the X register to the Y
register is to use the double sized key, ↑, at the left of the digit
keys. This repeats the number in X into Y, leaving X unchanged;
the number in Y goes to Z, and the number in Z is lost. Thus,
when squaring or cubing a number, it is only necessary to follow
↑ with × or × ×. The ↓ key repeats a number in Z
to Y leaving Z unchanged; the number in Y goes to X, and the
number in X is lost. The ROLL ↑ key rotates the number in the X
and Y registers up and the number in Z down into X. ROLL ↓ rotates
the numbers in Z and Y down and the number in X up into Z.
x⇄y interchanges the numbers in X and Y. Using the two ROLL
keys and x⇄y, numbers can be placed in any order in the three
registers.
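The register movements described above can be modeled with a three-field record. This is a sketch only: the class and method names are hypothetical, Python floats stand in for the calculator's decimal numbers, and the arithmetic methods assume, as the text states, that results appear in Y while X is left unchanged.

```python
from dataclasses import dataclass


@dataclass
class Display:
    x: float = 0.0  # keyboard register
    y: float = 0.0  # accumulator
    z: float = 0.0  # temporary storage

    def up(self):         # X repeated into Y, Y goes to Z, old Z is lost
        self.x, self.y, self.z = self.x, self.x, self.y

    def down(self):       # Z repeated into Y, Y goes to X, old X is lost
        self.x, self.y, self.z = self.y, self.z, self.z

    def roll_up(self):    # X and Y move up, Z comes around into X
        self.x, self.y, self.z = self.z, self.x, self.y

    def roll_down(self):  # Z and Y move down, X comes around into Z
        self.x, self.y, self.z = self.y, self.z, self.x

    def exchange(self):   # interchange X and Y
        self.x, self.y = self.y, self.x

    def multiply(self):   # product appears in Y, X unchanged
        self.y = self.y * self.x

    def divide(self):     # quotient appears in Y, divisor stays in X
        self.y = self.y / self.x


d = Display(x=3.0)
d.up()
d.multiply()
d.multiply()
print(d.y)  # cube of the number originally keyed into X
```

Because the multiply leaves X intact, each further multiply raises the power by one, which is why squaring and cubing need no re-entry of the number.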


Functions  available  from  the  keyboard 

The group of keys at the far left of the keyboard, Fig. 4, gives
a good indication of the power of the Model 9100A. Most of the
common mathematical functions are available directly from the
keyboard. Except for |y|, the function keys operate on the number
in X, replacing it with the function of that argument. The numbers
in Y and Z are left unchanged. √x is located with another group
of keys for convenience but operates the same way.

The circular functions operate with angles expressed in RADIANS
or DEGREES as set by the switch above the keyboard. The
sine, cosine, or tangent of an angle is taken with a single keystroke.
There are no restrictions on direction, quadrant or number of
revolutions of the angle. The inverse functions are obtained by
using the arc key as a prefix. For instance, two key depressions
are necessary to obtain the arc sin x: arc, sin x. The angle obtained
will be the standard principal value. In radians:


−π/2 ≤ sin⁻¹ x ≤ π/2


0 ≤ cos⁻¹ x ≤ π


−π/2 < tan⁻¹ x < π/2


The hyperbolic sine, cosine, or tangent is obtained using the
hyper key as a prefix. The inverse hyperbolic functions are obtained
with three key depressions. Tanh⁻¹ x is obtained by hyper, arc, tan x.
The arc and hyper keys prefix keys below them in their column.

log x and ln x obtain the log to the base 10 and the log to
the base e respectively. The inverse of the natural log is obtained
with the eˣ key. These keys are useful when raising numbers to
odd powers as shown in one of the examples on the pull-out card,
Fig. 3.

Two keys in this group are very useful in programs. int x takes
the integer part of the number in the X register, which deletes
the part of the number to the right of the decimal point. For
example, int(−3.1416) = −3. |y| forces the number in the Y
register positive.

Storage  registers 

Sixteen registers, in addition to X, Y, and Z, are available for
storage. Fourteen of them, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d,
can be used to store either one constant or 14 program steps per
register. The last registers, e and f, are normally used only for
constant storage since the program counter will not cycle into
them. Special keys located in a block to the left of the digit keys
are used to identify the lettered registers.

To store a number from the X register the x→( ) key is used. The
parenthesis indicates that another key depression, representing the
storage register, is necessary to complete the transfer. For example,
storing a number from the X register into register 8 requires two
key depressions: x→( ) 8. The X register remains unchanged. To
store a number from the Y register the key y→( ) is used.

The  contents  of  the  alpha  registers  are  recalled  to  X  simply 
by  pressing  the  keys  a,  b,  c,  d,  e,  and  f.  Recalling  a  number  from 
a numbered register requires the use of the y⇄( ) key to distinguish
the  recall  procedure  from  digit  entry.  This  key  interchanges  the 
number  in  the  Y  register  with  the  number  in  the  register  indicated 
by  the  following  keystroke,  alpha  or  numeric,  and  is  also  useful 
in  programs  since  neither  number  involved  in  the  transfer  is  lost. 

The  CLEAR  key  sets  the  X,  Y,  and  Z  display  registers  and  the 
f  and  e  registers  to  zero.  The  remaining  registers  are  not  affected. 
The  f  and  e  registers  are  set  to  zero  to  initialize  them  for  use  with 
the ACC+ and ACC− keys as will be explained. In addition the CLEAR
key  clears  the  FLAG  and  the  ARC  and  HYPER  conditions,  which 
often  makes  it  a  very  useful  first  step  in  a  program. 

Coordinate  transformation  and  complex  numbers 

Vectors and complex numbers are easily handled using the keys
in the column on the far left of the keyboard. Figure 5 defines
the variables involved. Angles can be either in degrees or radians.
To convert from rectangular to polar coordinates, with y in Y and
x in X, press TO POLAR. Then the display shows θ in Y and R in X. In
converting from polar to rectangular coordinates, θ is placed in
Y, and R in X, TO RECT is pressed and the display shows y in Y and
x in X.


θ = tan⁻¹(y/x)
R = √(x² + y²)
x = R cos θ


Fig. 5. Variables involved in conversions between rectangular and polar
coordinates.

ACC+  and  ACC—  allow  addition  or  subtraction  of  vector 
components  in  the  f  and  e  storage  registers.  ACC+  adds  the 
contents  of  the  X  and  Y  register  to  the  numbers  already  stored 
in  f  and  e  respectively;  ACC—  subtracts  them.  The  RCL  key 
recalls  the  numbers  in  the  f  and  e  registers  to  X  and  Y. 
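A common use of these keys is adding vectors given in polar form: each is converted to rectangular components, accumulated in f and e, then recalled and converted back. The Python sketch below follows that sequence. The class and function names are hypothetical, angles are in radians only, and `math.atan2` is used as a four-quadrant stand-in for the tan⁻¹(y/x) of Fig. 5; how the calculator resolves the quadrant is not stated here.

```python
import math


class Vectors:
    """X and Y display registers plus the f and e accumulators."""

    def __init__(self):
        self.x = self.y = 0.0
        self.f = self.e = 0.0

    def to_polar(self):   # x in X, y in Y  ->  R in X, theta in Y
        self.x, self.y = math.hypot(self.x, self.y), math.atan2(self.y, self.x)

    def to_rect(self):    # R in X, theta in Y  ->  x in X, y in Y
        self.x, self.y = self.x * math.cos(self.y), self.x * math.sin(self.y)

    def acc_plus(self):   # add X to f and Y to e
        self.f += self.x
        self.e += self.y

    def acc_minus(self):  # subtract X from f and Y from e
        self.f -= self.x
        self.e -= self.y

    def rcl(self):        # recall f to X and e to Y
        self.x, self.y = self.f, self.e


def add_polar(vectors):
    """Sum (R, theta) pairs and return the resultant as (R, theta)."""
    calc = Vectors()
    for r, theta in vectors:
        calc.x, calc.y = r, theta
        calc.to_rect()
        calc.acc_plus()
    calc.rcl()
    calc.to_polar()
    return calc.x, calc.y


print(add_polar([(1.0, 0.0), (1.0, math.pi / 2)]))
```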

Illegal  operations 

A light to the left of the CRT indicates that an illegal operation
has been performed. This can happen either from the keyboard
or when running a program. Pressing any key on the keyboard
will reset the light. When running a program, execution will
continue but the light will remain on as the program is completed.
The illegal operations are:

Division by zero

√x where x < 0

ln x where x < 0; log x where x < 0

sin⁻¹ x where |x| > 1; cos⁻¹ x where |x| > 1

cosh⁻¹ x where x < 1; tanh⁻¹ x where |x| > 1

Accuracy 

The  Model  9100A  does  all  calculations  using  floating  point  arith- 
metic with  a  twelve  digit  mantissa  and  a  two  digit  exponent.  The 
two  least  significant  digits  are  not  displayed  and  are  called  guard 
digits. 

The  algorithms  used  to  perform  the  operations  and  generate 
the  functions  were  chosen  to  minimize  error  and  to  provide  an 
extended  range  of  the  argument.  Usually  any  inaccuracy  will  be 
contained within the two guard digits. In certain cases some
inaccuracy will appear in the displayed number. One example is
where the functions change rapidly for small changes in the
argument, as in tan x where x is near 90°. A glaring but insignificant
inaccuracy occurs when an answer is known to be a whole number,
but the least significant guard digit is one count low:
2.000 000 000 ~ 1.999 999 999.

Accuracy is discussed further in the 'Internal Programming'
section in this chapter. But a simple summary is: the answer
resulting from any operation or function will lie within the range of
true values produced by a variation of ±1 count in the tenth digit
of the argument.
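The whole-number effect can be reproduced with Python's `decimal` module. The sketch below assumes twelve-digit arithmetic that truncates and a ten-digit display that also truncates. Both rounding choices are assumptions made for illustration, and the calculation (2 ÷ 3 × 3) is a hypothetical example rather than one taken from the calculator's algorithms.

```python
from decimal import Context, Decimal, ROUND_DOWN

# Assumed model: 12-digit mantissa internally, 10 digits shown, both truncating.
INTERNAL = Context(prec=12, rounding=ROUND_DOWN)
DISPLAY = Context(prec=10, rounding=ROUND_DOWN)


def two_thirds_times_three() -> Decimal:
    """Compute (2 / 3) * 3 in twelve-digit arithmetic."""
    quotient = INTERNAL.divide(Decimal(2), Decimal(3))
    return INTERNAL.multiply(quotient, Decimal(3))


stored = two_thirds_times_three()
shown = DISPLAY.plus(stored)  # apply the display precision to the stored value
print(stored, shown)
```

Under these assumptions the stored result falls just short of 2 in the guard digits, so the ten displayed digits read 1.999 999 999 although the true answer is a whole number.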

Programming 

Problems that require many keyboard operations are more easily
solved with a program. This is particularly true when the same
operations must be performed repeatedly or an iterative technique
must be used. A program library supplied with the Model 9100A
provides a set of representative programs from many different
fields. If a program cannot be found in the library to solve a
particular problem, a new program can easily be written since
no special experience or prior knowledge of a programming
language is necessary.

Any key on the keyboard can be remembered by the calculator
as a program step except STEP PRGM. This key is used to 'debug'
a program rather than as an operation in a program. Many
individual program steps, such as 'sin x' or 'to polar' are comparatively
powerful, and avoid the need of sub-routines for these functions
and the programming space such sub-routines require. Registers
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d can store 14 program steps
each. Steps within the registers are numbered 0 through d just
as the registers themselves are numbered. Programs can start at
any of the 196 possible addresses. However 0-0 is usually used for
the first step. Address d-d is then the last available, after which
the program counter cycles back to 0-0.
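The addressing scheme is effectively a two-digit base-14 number. A minimal Python sketch of the conversion and the wrap-around follows; the function names and the linear step index are illustrative conveniences, not part of the calculator.

```python
DIGITS = "0123456789abcd"  # 14 register names, reused as 14 step names
STEPS = len(DIGITS) ** 2   # 14 registers x 14 steps = 196 addresses


def to_index(address: str) -> int:
    """Convert an address such as '2-d' to a step number 0..195."""
    register, step = address.split("-")
    return DIGITS.index(register) * len(DIGITS) + DIGITS.index(step)


def to_address(index: int) -> str:
    """Convert a step number back to register-step form, wrapping at 196."""
    register, step = divmod(index % STEPS, len(DIGITS))
    return f"{DIGITS[register]}-{DIGITS[step]}"


def next_address(address: str) -> str:
    """Address the program counter moves to after one step."""
    return to_address(to_index(address) + 1)


print(next_address("2-d"), next_address("d-d"))
```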

Registers  f  and  e  are  normally  used  for  storage  of  constants  only, 
one constant in each register. As more constant storage is required,
it is recommended that registers d, then c, then b, etc., be used
starting  from  the  bottom  of  the  list.  Lettered  registers  are  used 
first,  for  the  frequently  recalled  constants,  because  constants  stored 
in  them  are  more  easily  recalled.  A  register  can  be  used  to  store 
one  constant  or  14  program  steps,  but  not  both. 

Branching 

The bank on the far right of the keyboard, Fig. 4, contains program
oriented keys. GO TO ( )( ) is used to set the program counter. The two
sets of parentheses indicate that this key should be followed by
two more key depressions indicating the address of the program
step desired. As a program step, 'GO TO' is an unconditional
branch instruction, which causes the program to branch to the
address given by the next two program steps. The 'IF' keys in this
group are conditional branch instructions. With IF x < y, IF x = y,
and IF x > y the numbers contained in the X and Y registers are compared.
The indicated condition is tested and, if met, the next two program
steps are executed. If the first is alphameric, the second must be
also, and the two steps are interpreted as a branching address.
When the condition is not met, the next two steps are skipped
and the program continues. IF FLAG is also a very useful conditional
branching instruction which tests a 'yes' or 'no' condition
internally stored in the calculator. This condition is set to 'yes' with
the SET FLAG from the keyboard when the calculator is in the
display mode or from a program as a program step. The flag is
set to a 'no' condition by either asking IF FLAG in a program
or by a CLEAR instruction from the keyboard or from a program.

Data  input  and  output 

Data can be entered for use in a program when the machine is
in the display mode. (The screen is blank while a program is
running.) A program can be stopped in several ways. The STOP key
will halt the machine at any time. The operation being performed
will be completed before returning to the display mode. As a
program step, STOP stops the program so that answers can be
displayed or new data entered. END must be the last step in a
program listing to signal the magnetic card reader; when
encountered as a program step it stops the machine and also sets the
program counter to 0-0.

As a program step, PAUSE causes a brief display during program
execution. Nine cycles of the power line frequency are
counted; the duration of the pause will be about 150 ms for a 60
Hz power line or 180 ms for a 50 Hz power line. More pauses
can be used in sequence if a longer display is desired. While a
program is running the PAUSE key can be held down to stop the
machine when it comes to the next PAUSE in the program. PAUSE
provides a particularly useful way for the user and the machine
to interact. It might, for instance, be used in a program so that
the convergence to a desired result can be observed.
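The two durations follow directly from counting nine line cycles, as this small Python check shows. The function name is hypothetical.

```python
def pause_ms(line_hz: float, cycles: int = 9) -> float:
    """Duration in milliseconds of a pause lasting `cycles` line cycles."""
    return cycles * 1000.0 / line_hz


print(pause_ms(60), pause_ms(50))
```

Consecutive PAUSE steps simply add, so three in a row on a 60 Hz line would hold the display for about 450 ms.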

Other means of input and output involve peripheral devices
such as an X-Y Plotter or a Printer. The PRINT key activates the
printer, causing it to print information from the display register.
As a program step, PRINT will interrupt the program long enough
for the data to be accepted by the printer and then the program
will continue. If no printer is attached, PRINT as a program step
will act as a STOP. The FMT key, followed by any other keystroke,
provides up to 62 unique commands to peripheral equipment. This
flexibility allows the Model 9100A to be used as a controller in
small systems.

Sample program: N!

A simple program to calculate N! demonstrates how the Model
9100A is programmed. Figure 6 (top) shows a flow chart to compute
N! and Fig. 6 (bottom) shows the program steps. With this
program,  60!  takes  less  than      second  to  compute. 

Program  entry  and  execution 

After a program is written it can be entered into the Model 9100A
from the keyboard. The program counter is set to the address of



[Fig. 6: flow chart (top); table of program steps with columns Step, Key, Code, Display, and Storage (bottom).]

Fig.  6.  Flow  chart  of  a  program  to  compute  N!  (top).  Each  step  is  shown 
(bottom)  and  the  display  for  each  register.  A  new  value  for  N  can  be 
entered  at  the  end  of  the  program,  since  END  automatically  sets  the 
program  counter  back  to  0-0. 


the first program step by using the GO TO ( )( ) key. The
RUN-PROGRAM switch is then switched from RUN to PROGRAM and
the program steps entered in sequence by pushing the proper keys.
As each step is entered the X register displays the address and
key code, as shown in Fig. 7. The keys and their codes are listed
at the bottom of the pull-out card, Fig. 3. Once a program has
been entered, the steps can be checked using the STEP PRGM
key in the PROGRAM mode as explained in Fig. 7. If an error


[Fig. 7: CRT display with the Z (temporary), Y (accumulator), and X (keyboard) registers labeled.]

Fig.  7.  Program  step  address  and  code  are  displayed  in  the  X  register 
as  steps  are  entered.  After  a  program  has  been  entered,  each  step  can 
be  checked  using  the  STEP  PRGM  key.  In  this  display,  step  2-d  is  36, 
the  code  for  multiply. 


is  made  in  a  step,  it  can  be  corrected  by  using  the  key  without 
having  to  re-enter  the  rest  of  the  program. 

To run a program, the program counter must be set to the
address of the first step. If the program starts at 0-0 the keys
GO TO ( )( ) 0 0 are depressed, or simply just END since this key
automatically sets the program counter to 0-0. CONTINUE will start
program execution.

Magnetic  card  reader-recorder 

One of the most convenient features of the Model 9100A is the
magnetic card reader-recorder, Fig. 8. A program stored in the
Model 9100A can be recorded on a magnetic card, Fig. 9, about


Fig.  8.  Programs  can  be  entered  into  the  calculator  by  means  of  the 
magnetic  program  card.  The  card  is  inserted  into  the  slot  and  the 
ENTER  button  pressed. 




Fig.  9.  Magnetic  programming  card  can  record  two  196-step  programs. 
To  prevent  accidental  recording  of  a  new  program  over  one  to  be  saved, 
the  corner  of  the  card  is  cut  as  shown. 


the  size  of  a  credit  card.  Later  when  the  program  is  needed  again, 
it  can  be  quickly  re-entered  using  the  previously  recorded  card. 
Cards  are  easily  duplicated  so  that  programs  of  common  interest 
can  be  distributed. 

As  mentioned  earlier,  the  END  statement  is  a  signal  to  the 


reader  to  stop  reading  recorded  information  from  the  card  into 
the  calculator.  For  this  reason  END  should  not  be  used  in  the 
middle  of  a  program.  Since  most  programs  start  at  location  0-0 
the  reader  automatically  initializes  the  program  counter  to  0-0 
after  a  card  is  read. 

The  magnetic  card  reader  makes  it  possible  to  handle  most 
programs  too  long  to  be  held  in  memory  at  one  time.  The  first 
entry  of  steps  can  calculate  intermediate  results  which  are  stored 
in  preparation  for  the  next  part  of  the  program.  Since  the  reader 
stops  reading  at  the  END  statement  these  stored  intermediate 
results  are  not  disturbed  when  the  next  set  of  program  steps  is 
entered.  The  stored  results  are  then  retrieved  and  the  program 
continued.  Linking  of  programs  is  made  more  convenient  if  each 
part  can  execute  an  END  when  it  finishes  to  set  the  program 
counter  to  0-0.  It  is  then  only  necessary  to  press  CONTINUE  after 
each entry of program steps.

Hardware  design  of  the  Model  9100A  calculator 

All keyboard functions in the Model 9100A are implemented by
the arithmetic processing unit, Figs. 10 and 11. The arithmetic
unit  operates  in  discrete  time  periods  called  clock  cycles.  All 


Specifications of HP Model 9100A*

The HP Model 9100A is a programmable,
electronic calculator which performs operations
commonly encountered in scientific
and engineering problems. Its log, trig and
mathematical functions are each performed
with a single key stroke, providing fast,
convenient solutions to intricate equations.
Computer-like memory enables the
calculator to store instructions and constants
for repetitive or iterative solutions.
The easily-readable cathode ray tube
instantly displays entries, answers and
intermediate results.

Operations

Direct keyboard operations include:

Arithmetic: addition, subtraction, multiplication,
division and square-root.

Logarithmic: log x, ln x and eˣ.

Trigonometric: sin x, cos x, tan x,
sin⁻¹ x, cos⁻¹ x and tan⁻¹ x (x in degrees
or radians).

Hyperbolic: sinh x, cosh x, tanh x,
sinh⁻¹ x, cosh⁻¹ x, and tanh⁻¹ x.

Coordinate transformation: polar-to-rectangular,
rectangular-to-polar,
cumulative addition and subtraction
of vectors.

Miscellaneous: other single-key operations
include taking the absolute
value of a number, extracting the
integer part of a number, and entering
the value of π. Keys are also
available for positioning and storage
operations.

Programming

The program mode allows entry of
program instructions, via the keyboard,
into program memory. Programming
consists of pressing keys in the proper
sequence, and any key on the keyboard
is available as a program step. Program
capacity is 196 steps. No language or
code-conversions are required. A self-contained
magnetic card reader/recorder
records programs from program
memory onto wallet-size magnetic
cards for storage. It also reads programs
from cards into program memory for
repetitive use. Two programs of 196
steps each may be recorded on each
reusable card. Cards may be cascaded
for longer programs.

Speed 

Average  times  for  total  performance  of 
typical  operations,  including  decimal- 
point  placement: 

add,  subtract:  2  milliseconds 

multiply:  12  milliseconds 

divide:  18  milliseconds 

square-root:  19  milliseconds 

sin,  cos,  tan:  280  milliseconds 

ln x: 50 milliseconds

e^x: 110 milliseconds
These  times  include  core  access  of 
1.6  microseconds. 

General 

Weight: Net 40 lbs. (18.1 kg.); shipping 65 lbs. (29.5 kg.).
Power: 115 or 230 V ± 10%, 50 to 60 Hz, 400 Hz, 70 watts.
Dimensions: 8¼" high, 16" wide, 19" deep.


•Courtesy  of  Loveland  Division, 


250  Part  3     The  instruction-set  processor  level:  variations  in  the  processor 


Section  4     Desk  calculator  computers:  keyboard  processors  with  small  memories 


[Block diagram. Blocks shown: 825 ns clock; control ROM (64 words, 29 bits/word, about 2K bits) with the control logic address flip flops; program ROM (512 words, 64 bits/word) with the program ROM address flip flops; coincident current core memory (368 words, 6 bits/word, about 2.2K bits) with the core memory address flip flops; data flip flops; hard-wired logic gates (instructions). Input: keyboard, rear plug. Output: display, rear plug, illegal operation light.]

Fig.  10.  Arithmetic  processing  unit  block  diagram.  This  system  is  a  marriage  of  conventional,  reliable  diode-resistor  logic  to  a  32,000-bit  read-only 
memory  and  a  coincident  current  core  memory. 


operations are synchronized by the clock shown at the top center
of  Fig.  10. 

The clock is connected to the control read only memory (ROM)
which coordinates the operation of the program read only memory
and the coincident current core read/write memory. The former
contains information for implementing all of the keyboard operations
while the latter stores user data and user programs.

Fig. 11. Arithmetic unit assembly removed from the calculator.

Internal operations are performed on a digit by digit serial
basis  using  binary  coded  decimal  digits.  An  addition,  for  example, 
requires  that  the  least  significant  digits  of  the  addend  and  augend 
be  extracted  from  core,  then  added  and  their  sum  replaced  in  core. 
This  process  is  repeated  one  BCD  digit  at  a  time  until  the  most 
significant  digits  have  been  processed.  There  is  also  a  substantial 
amount  of  'housekeeping'  to  be  performed  such  as  aligning  decimal 
points,  assigning  the  proper  algebraic  sign,  and  floating  point 
normalization.  Although  the  implementation  of  a  keyboard  func- 
tion may  involve  thousands  of  clock  cycles,  the  total  elapsed  time 
is  in  the  millisecond  region  because  each  clock  cycle  is  only  825 
ns  long. 
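
The digit-serial scheme described above can be sketched as follows. This is only an illustration, not the calculator's microcode: the function name, the list-of-digits representation (least significant digit first) and the omission of sign, exponent and decimal-point housekeeping are all assumptions made here.

```python
def bcd_serial_add(augend, addend):
    """Add two equal-length decimal digit lists, least significant digit first.

    Returns (sum_digits, carry_out). Hypothetical helper for illustration.
    """
    if len(augend) != len(addend):
        raise ValueError("operands must have the same number of digits")
    result = []
    carry = 0
    for a, b in zip(augend, addend):   # one digit pair per pass
        total = a + b + carry
        result.append(total % 10)      # digit that would be written back to core
        carry = total // 10            # carry held over for the next digit
    return result, carry


# 199 + 001, digits listed least significant first
print(bcd_serial_add([9, 9, 1], [1, 0, 0]))
```

Each pass of the loop stands in for one fetch, add and store sequence on a single BCD digit.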

The  program  ROM  contains  512  64-bit  words.  When  the  pro- 
gram ROM  is  activated,  signals  (micro-instructions)  corresponding 
to  the  bit  pattern  in  the  word  are  sent  to  the  hard  wired  logic 
gates  shown  at  the  bottom  of  Fig.  10.  The  logic  gates  define  the 
changes  to  occur  in  the  flip  flops  at  the  end  of  a  clock  cycle.  Some 
of  the  micro-instructions  act  upon  the  data  flip  flops  while  others 
change  the  address  registers  associated  with  the  program  ROM, 


Chapter  20  j  The  HP  Model  9100A  computing  calculator  251 


control  ROM  and  coincident  current  core  memory.  During  the 
next  clock  cycle  the  control  ROM  may  ask  for  a  new  set  of  micro- 
instructions from  the  program  ROM  or  ask  to  be  read  from  or 
written  into  the  coincident  current  core  memory.  The  control 
ROM  also  has  the  ability  to  modify  its  own  address  register  and 
to  issue  micro-instructions  to  the  hard  wired  logic  gates.  This 
flexibility  allows  the  control  logic  ROM  to  execute  special  pro- 
grams such  as  the  subroutine  for  unpacking  the  stored  constants 
required  by  the  keyboard  transcendental  functions. 

Control  logic 

The  control  logic  uses  a  wire  braid  toroidal  core  read  only  memory 
containing  64  29-bit  words.  Magnetic  logic  of  this  type  is  extremely 
reliable  and  pleasingly  compact. 

The  crystal  controlled  clock  source  initiates  a  current  pulse 
having  a  trapezoidal  waveform  which  is  directed  through  one  of 
64  word  lines.  Bit  patterns  are  generated  by  passing  or  threading 
selected  toroids  with  the  word  lines.  Each  toroid  that  is  threaded 
acts  as  a  transformer  to  turn  on  a  transistor  connected  to  the 
output  winding  of  the  toroid.  The  signals  from  these  transistors 
operate  the  program  ROM,  coincident  current  core,  and  selected 
micro-instructions. 

Coincident  current  core  read/write  memory 

The 2208 (6 × 16 × 23) bit coincident current memory uses wide
temperature  range  lithium  cores.  In  addition,  the  X,  Y,  and  inhibit 
drivers  have  temperature  compensated  current  drive  sources  to 
make  the  core  memory  insensitive  to  temperature  and  power 
supply  variations. 

The  arithmetic  processing  unit  includes  special  circuitry  to 
guarantee  that  information  is  not  lost  from  the  core  memory  when 
power  is  turned  off  and  on. 

Power  supplies 

The arithmetic processing unit operates from a single -15 volt
supply. Even though the power supply is highly regulated, all
circuits are designed to operate over a voltage range of -13.5
to -16.5.

Display 

The display is generated on an HP electrostatic cathode ray tube
only 11 inches long. The flat rectangular face plate measures
3¼ × 4 13/16 inches. The tube was specifically designed to generate
a bright image. High contrast is obtained by using a low
transmissivity filter in front of the CRT. Ambient light that usually
tends to 'wash out' an image is attenuated twice by the filter, while
the screen image is only attenuated once.
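
The filter argument can be made concrete with a rough worked ratio. The symbols are assumptions introduced here, not taken from the text: T is the filter transmissivity, L the luminance of the screen image, A the ambient light falling on the tube, and ρ the reflectance of the faceplate.

```latex
\[
\frac{\text{image seen by viewer}}{\text{reflected ambient seen by viewer}}
  = \frac{T\,L}{T^{2}\,\rho\,A}
  = \frac{1}{T}\cdot\frac{L}{\rho\,A}
\]
```

Under these assumptions a filter with T = 0.25 would raise the image-to-ambient ratio by roughly a factor of 4 compared with no filter, at the cost of a dimmer image, which is consistent with the tube being designed for high brightness.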


All the displayed characters are 'pieces of eight.' Sixteen different
symbols are obtained by intensity modulating a figure 8 pattern
as  shown  in  Fig.  12.  Floating  point  numbers  are  partitioned  into 
groups  of  three  digits  and  the  numeral  1  is  shifted  to  improve 
readability.  Zeros  to  the  left  of  the  most  significant  digit  and 
insignificant  zeros  to  the  right  of  the  decimal  point  are  blanked 
to  avoid  a  confusing  display.  Fixed  point  numbers  are  automati- 
cally rounded  up  according  to  the  decimal  wheel  setting.  A  fixed 
point  display  will  automatically  revert  to  floating  point  notation 
if the number is too large to be displayed on the CRT in fixed
point. 

Multilayer instruction logic board

All of the hard wired logic gates are synthesized on the instruction
logic board using time-proven diode-resistor logic. The diodes and
resistors are located in separate rows (Fig. 13). All diodes are
oriented in the same direction and all resistors are the same value.
The maze of interconnections normally associated with the back
plane wiring of a computer is located on the six internal layers
of the multilayer instruction logic board. Solder bridges and accidental
shorts caused by test probes shorting to leads beneath
components are all but eliminated by not having interconnections
on  the  two  outside  surfaces  of  this  multilayer  board.  The  instruc- 
tion logic  board  also  serves  as  a  motherboard  for  the  control  logic 
board,  the  two  coincident  core  boards  and  the  two  flip  flop  boards, 
the  magnetic  card  reader,  and  the  keyboard.  It  also  contains  a 
connector,  available  at  the  rear  of  the  calculator,  for  connecting 
peripherals. 

Flip  flops 

The Model 9100A contains 40 identical J-K flip flops, each having
a  threshold  noise  immunity  of  2.5  volts.  Worst  case  design  tech- 
niques guarantee  that  the  flip  flops  will  operate  at  3  MHz  even 
though  1.2  MHz  is  the  maximum  operating  rate. 


Fig.  12.  Displayed  characters  are  generated  by  modulating  these  figures. 
The  digit  1  is  shifted  to  the  center  of  the  pattern. 


Fig.  13.  Printed-circuit  boards  which  make  up  the  arithmetic  unit  are,  left  to  right  at  top,  side  board,  control  logic,  flip  flop,  core  and  drivers,  core 
sense  amplifiers  and  inhibit,  flip  flop,  and  side  board.  Large  board  at  the  lower  left  is  the  multilayer  instruction  board,  and  the  program  ROM  is  at 
the  right.  The  magnetic  card  reader  and  its  associated  circuitry  are  at  the  bottom. 




Program  read  only  memory 

The 32,768 bit read only program memory consists of 512 64-bit
words.  These  words  contain  all  of  the  operating  subroutines,  stored 
constants,  character  encoders,  and  CRT  modulating  patterns.  The 
512 words are contained in a 16 layer printed-circuit board having
drive  and  sense  lines  orthogonally  located.  A  drive  line  consists 
of  a  reference  line  and  a  data  line.  Drive  pulses  are  inductively 
coupled  from  both  the  reference  line  and  data  line  into  the  sense 
lines.  Signals  from  the  data  line  either  aid  or  cancel  signals  from 
the  reference  line  producing  either  a  1  or  0  on  the  output  sense 
lines.  The  drive  and  sense  lines  are  arranged  to  achieve  a  bit 
density  in  the  ROM  data  board  of  1000  bits  per  square  inch. 

The program ROM decoder/driver circuits are located directly
above  the  ROM  data  board.  Thirty-two  combination  sense  ampli- 
fier, gated-latch  circuits  are  located  on  each  side  of  the  ROM  data 
board.  The  outputs  of  these  circuits  control  the  hard  wired  logic 
gates  on  the  instruction  logic  board. 

Side  boards 

The program ROM printed circuit board and the instruction logic
board  are  interconnected  by  the  side  boards,  where  preliminary 
signal  processing  occurs. 

The  keyboard 

The keyboard contains 63 molded plastic keys. Their markings will
not wear off because the lettering is imbedded into the key body
using a double shot injection molding process. The key and switch
assembly was specifically designed to obtain a pleasing feel and
the proper amount of tactile and aural feedback. Each key operates
a single switch having gold alloy contacts. A contact closure activates
a matrix which encodes signals on six data lines and generates
an initiating signal. This signal is delayed to avoid the effects of
contact  bounce.  An  electrical  interlock  prevents  errors  caused  by 
pressing  more  than  one  key  at  a  time. 

Magnetic  card  reader 

Two  complete  196  step  programs  can  be  recorded  on  the  credit 
card  size  magnetic  program  card.  The  recording  process  erases 
any previous information so that a card may be used over and
over again. A program may be protected against accidental erasure
by clipping off the corner of the card (Fig. 9, page 249). The missing
corner  deactivates  the  recording  circuitry  in  the  magnetic  card 
reader.  Program  cards  are  compatible  among  machines. 

Information is recorded in four tracks with a bit density of 200
bits per inch. Each six-bit program step is split into two time-multiplexed,
three-bit codes and recorded on three of the four
tracks. The fourth track provides the timing strobe.
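
The splitting of a program step across the three data tracks can be sketched as below. This is an assumption-laden illustration: the text does not say which half of the step is recorded first, so the high-half-first order, the function names, and the track-length estimate (which ignores any leader, gaps or END code) are all hypothetical.

```python
STEPS_PER_PROGRAM = 196  # from the text
BITS_PER_INCH = 200      # from the text


def split_step(step):
    """Split a six-bit step into two three-bit codes (high half first, assumed)."""
    if not 0 <= step < 64:
        raise ValueError("program step must fit in six bits")
    return (step >> 3) & 0b111, step & 0b111


def join_step(first, second):
    """Recombine two three-bit codes into the original six-bit step."""
    return ((first & 0b111) << 3) | (second & 0b111)


def track_length_inches(steps=STEPS_PER_PROGRAM):
    """Each step occupies two bit cells on every data track."""
    return steps * 2 / BITS_PER_INCH


print(split_step(0b101011), track_length_inches())
```

On these figures one 196-step program occupies just under two inches of track, which fits the statement that two programs share one card.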

Information  is  read  from  the  card  and  recombined  into  six  bit 
codes  for  entry  into  the  core  memory.  The  magnetic  card  reading 
circuitry recognizes the 'END' program code as a signal to end
the  reading  process.  This  feature  makes  it  possible  to  enter  sub- 
routines within  the  body  of  a  main  program  or  to  enter  numeric 
constants  via  the  program  card.  The  END  code  also  sets  the 
program counter to location 0-0, the most probable starting location.
The latter feature makes the Model 9100A ideally suited to
'linking'  programs  that  require  more  than  196  steps. 

Packaging  and  servicing 

The packaging of the Model 9100A began by giving the HP industrial
design group a volume estimate of the electronics package,
the CRT display size and the number of keys on the keyboard.
Several sketches were drawn and the best one was selected. The
electronics sections were then specifically designed to fit in this
case. Much time and effort were spent on the packaging of the
arithmetic processing unit. The photographs, Figs. 11 and 14,
attest to the fact that it was time well spent.

The case covers are die cast aluminum which offers durability,
effective RFI shielding, excellent heat transfer characteristics, and
convenient mechanical mounts. Removing four screws allows the
case to be opened and locked into position (Fig. 14). This procedure
exposes all important diagnostic test points and adjustments. The
keyboard and arithmetic processing unit may be freed by removing
four and seven screws respectively.

Any component failures can be isolated by using a diagnostic
routine  or  a  special  tester.  The  faulty  assembly  is  then  replaced 
and  is  sent  to  a  service  center  for  computer  assisted  diagnosis  and 
repair. 

Reliability 

Extensive precautions have been taken to insure maximum reliability.
Initially, wide electrical operating margins were obtained
by using "worst case" design techniques. In production all transistors
are aged at 80% of rated power for 96 hours and tested before
being used in the Model 9100A. Subassemblies are computer tested
and actual operating margins are monitored to detect trends that
could lead to failures. These data are analyzed and corrective
action is initiated to reverse the trend. In addition, each calculator
is operated in an environmental chamber at 55 °C for 5 days prior
to shipment to the customer. Precautions such as these allow
Hewlett-Packard to offer a one year warranty in a field where 90
days is an accepted standard.




Fig.  14.  Internal  adjustments  of  the  calculator  are  easily  accessible  by 
removing a few screws and lifting the top.


Internal  programming  of  the  9100A  calculator 

Extensive  internal  programming  has  been  designed  into  the  HP 
Model  9100A  Calculator  to  enable  the  operator  to  enter  data  and 
to  perform  most  arithmetic  operations  necessary  for  engineering 
and  scientific  calculation  with  a  single  key  stroke  or  single  program 
step.  Each  of  the  following  operations  is  a  hardware  subroutine 
called by a key press or program step:

Basic  arithmetic  operations 
Addition 
Subtraction 
Multiplication 
Division 

Extended  arithmetic  operations 
Square  root 
Exponential: e^x
Logarithmic: ln x, log x
Vector  addition  and  subtraction 


Trigonometric operations
Sin x, cos x, tan x
Arcsin x, arccos x, arctan x
Sinh x, cosh x, tanh x
Arcsinh x, arccosh x, arctanh x
Polar to rectangular and rectangular to
polar coordinate transformation

Miscellaneous
Enter π
Absolute value of y
Integer value of x

In  the  evolution  of  internal  programming  of  the  Model  9100A 
Calculator,  the  first  step  was  the  development  of  flow  charts  of 
each  function.  Digit  entry,  Fig.  15,  seemingly  a  trivial  function, 
is as complex as most of the mathematical functions. From this
functional  description,  a  detailed  program  can  be  written  which 
uses  the  microprograms  and  incremental  instructions  of  the  calcu- 
lator. Also,  each  program  must  be  married  to  all  of  the  other 
programs  which  make  up  the  hard-wired  software  of  the  Model 
9100A. Mathematical functions are similarly programmed, defining
a  step-by-step  procedure  or  algorithm  for  solving  the  desired 
mathematical  problem. 

The  calculator  is  designed  so  that  lower-order  subroutines  may 
be nested to a level of five in higher-order functions. For instance,
the  'Polar  to  Rectangular'  function  uses  the  sin  routine  which  uses 
multiply  which  uses  add,  etc. 

Addition  and  subtraction 

The  most  elementary  mathematical  operation  is  algebraic  addi- 
tion. But  even  this  is  relatively  complex — it  requires  comparing 
signs  and  complementing  if  signs  are  unlike.  Because  all  numbers 
in the Model 9100A are processed as true floating point numbers,
exponents  must  be  subtracted  to  determine  proper  decimal  align- 
ment. If  one  of  the  numbers  is  zero,  it  is  represented  in  the  calcu- 
lator by  an  all-zero  mantissa  with  zero  exponent.  The  difference 
between  the  two  exponents  determines  the  offset,  and  rather  than 
shifting  the  smaller  number  to  the  right,  a  displaced  digit-by-digit 
addition  is  performed.  It  must  also  be  determined  if  the  offset  is 
greater  than  12,  which  is  the  resolution  limit. 
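
The alignment step can be sketched as follows for two operands of like sign. This is a simplified illustration, not the machine's routine: the `(mantissa, exponent)` tuple layout, the function name and the use of Python integers in place of digit-serial core accesses are assumptions. Each mantissa is a 12-digit integer standing for d.ddddddddddd, and zero is `(0, 0)` as in the text.

```python
DIGITS = 12  # mantissa length used for all calculations, per the text


def add_like_signed(m1, e1, m2, e2):
    """Add two non-negative decimal floats given as (mantissa, exponent)."""
    if m1 == 0:
        return m2, e2
    if m2 == 0:
        return m1, e1
    if e1 < e2:                       # make operand 1 the larger in magnitude
        m1, e1, m2, e2 = m2, e2, m1, e1
    offset = e1 - e2                  # difference of exponents gives the offset
    if offset > DIGITS:               # smaller operand is below the resolution limit
        return m1, e1
    total = m1 + m2 // 10 ** offset   # displaced addition; low digits are truncated
    if total >= 10 ** DIGITS:         # carry out of the top digit: renormalize
        total //= 10
        e1 += 1
    return total, e1


# 1.5 + 0.25 -> 1.75
print(add_like_signed(150000000000, 0, 250000000000, -1))
```

Unlike signs would additionally require the comparison and complementing mentioned in the text, which this sketch leaves out.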

Although  the  display  shows  10  significant  digits,  all  calculations 
are  performed  to  12  significant  digits  with  the  two  last  significant 
digits  (guard  digits)  absorbing  truncation  and  round-off  errors.  All 
registers  are  in  core  memory,  eliminating  the  need  for  a  large 
number of flip-flop registers. Even with the display in 'Fixed Point'
mode,  every  computed  result  is  in  storage  in  12  digits. 




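[Flow chart. Recoverable box labels: "Store digit in least significant exponent location"; "Store digit in most significant location"; "Read most significant digit location"; "Read next most significant digit location"; "Store new digit in this location". One branch is labeled "Yes" and one entry point is labeled "From CLEAR".]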


Fig.  15.  Flow  chart  of  a  simple  digit  entry.  Some  of  these  flow  paths 
are  used  by  other  calculator  operations  for  greater  hardware  efficiency. 


Multiplication 

Multiplication  is  successive  addition  of  the  multiplicand  as  deter- 
mined by  each  multiplier  digit.  Offset  in  the  digit  position  flip-flops 
is  increased  by  one  after  completion  of  the  additions  by  each 
multiplier digit. Exponents are added after completion of the
product.  Then  the  product  is  normalized  to  justify  a  carry  digit 
which  might  have  occurred. 
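
A minimal sketch of multiplication by successive addition, working on plain integers. The function name and the returned addition count are illustrative assumptions, and exponent handling and normalization are omitted.

```python
def multiply_by_repeated_addition(multiplicand, multiplier):
    """Return (product, number_of_additions) for non-negative integers."""
    product = 0
    additions = 0
    shift = 0                                     # current digit position (offset)
    while multiplier > 0:
        digit = multiplier % 10                   # next multiplier digit
        for _ in range(digit):                    # add the multiplicand 'digit' times
            product += multiplicand * 10 ** shift
            additions += 1
        shift += 1                                # offset increased by one per digit
        multiplier //= 10
    return product, additions


print(multiply_by_repeated_addition(123, 45))
```

The number of additions equals the sum of the multiplier digits, which suggests why multiply time depends on the operand.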

Division 

Division involves repeated subtraction of the divisor from the
dividend until an overdraft occurs. At each subtraction without
overdraft, the quotient digit is incremented by one at the digit
position of iteration. When an overdraft occurs, the dividend is
restored by adding the divisor. The division digit position is then
incremented and the process continued. Exponents are subtracted
after the quotient is formed, and the quotient normalized.
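
The subtract-until-overdraft procedure can be sketched as below. It is an illustration on integers only: the function name, the `digits` parameter and the requirement that the dividend be less than ten times the divisor (so that every quotient digit is 0 to 9) are assumptions made here.

```python
def restoring_divide(dividend, divisor, digits=12):
    """Return the first `digits` quotient digits of dividend / divisor.

    Assumes 0 <= dividend < 10 * divisor and divisor > 0.
    """
    quotient = []
    remainder = dividend
    for _ in range(digits):
        q = 0
        while True:
            remainder -= divisor        # trial subtraction
            if remainder < 0:           # overdraft
                remainder += divisor    # restore by adding the divisor back
                break
            q += 1                      # successful subtraction: bump quotient digit
        quotient.append(q)
        remainder *= 10                 # move to the next digit position
    return quotient


# 22 / 7 = 3.14285...
print(restoring_divide(22, 7, 6))
```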

Square  root 

Square root, in the Model 9100A, is considered a basic operation
and is done by pseudo division. The method used is an extension
of the integer relationship

$$\sum_{i=1}^{n} (2i - 1) = n^2$$

In square root, the divisor digit is incremented at each iteration,
and shifted when an overdraft and restore occurs. This is a very
fast algorithm for square root and is equal in speed to division.
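
One common digit-by-digit form of this odd-number method is sketched below on integers. It is not claimed to be the 9100A's exact routine: the function name is hypothetical, the operand is taken two decimal digits at a time, and a comparison replaces the overdraft-and-restore step for brevity.

```python
def sqrt_by_odd_subtraction(n):
    """Return (root, remainder) with root*root + remainder == n, for n >= 0."""
    pairs = []
    while n > 0:                      # split n into two-digit groups
        pairs.append(n % 100)
        n //= 100
    pairs.reverse()                   # most significant group first

    root = 0
    remainder = 0
    for pair in pairs:
        remainder = remainder * 100 + pair
        digit = 0
        trial = 20 * root + 1         # first odd "divisor" at this digit position
        while remainder >= trial:     # subtract while no overdraft would occur
            remainder -= trial
            trial += 2                # divisor incremented at each iteration
            digit += 1
        root = 10 * root + digit      # shift to the next digit position
    return root, remainder


print(sqrt_by_odd_subtraction(144))
```

At each digit position the subtracted values 20p + 1, 20p + 3, ... sum to (10p + d)² − (10p)², which is the extension of the integer relationship to successive digits.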

Circular  routines 

The circular routines (sin, cos, tan), the inverse circular routines
(arcsin, arccos, arctan) and the polar to rectangular and rectangular
to polar conversions are all accomplished by iterating through
a transformation which rotates the axes. Any angle may be represented
as an angle between 0 and 1 radian plus additional information
such as the number of times π/2 has been added or subtracted,
and its sign. The basic algorithm for the forward circular
function operates on an angle whose absolute value is less than
1 radian, but prescaling is necessary to indicate quadrant.

To obtain the scaling constants, the argument is divided by 2π,
the integer part discarded and the remaining fraction of the circle
multiplied by 2π. Then π/2 is subtracted from the absolute value
until the angle is less than 1 radian. The number of times π/2
is subtracted, the original sign of the argument, and the sign upon
completion of the last subtraction make up the scaling constants.
To preserve the quadrant information the scaling constants are
stored in the core memory.
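
The prescaling steps can be sketched as follows, using binary floating point in place of the calculator's decimal arithmetic. The function name and the returned tuple layout are assumptions; the sign of the returned residual plays the role of "the sign upon completion of the last subtraction".

```python
import math


def prescale(angle):
    """Reduce an angle in radians to (residual, quarter_turns, original_sign).

    The residual satisfies |residual| < 1, and
    original_sign * (residual + quarter_turns * pi/2) equals the angle modulo 2*pi.
    """
    original_sign = -1 if angle < 0 else 1
    turns = abs(angle) / (2 * math.pi)                     # divide by 2*pi
    residual = (turns - math.floor(turns)) * 2 * math.pi   # keep the fraction of a circle
    quarter_turns = 0
    while residual >= 1.0:                                 # subtract pi/2 until below 1 radian
        residual -= math.pi / 2
        quarter_turns += 1
    return residual, quarter_turns, original_sign


print(prescale(2.0))
```

Because π/2 exceeds 1, the last subtraction can leave a negative residual, which is why its sign has to be kept as one of the scaling constants.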




The algorithm produces tan θ. Therefore, in the Model 9100A,
cos θ is generated as

$$\cos\theta = \frac{1}{\sqrt{1 + \tan^2\theta}}$$

and sin θ as

$$\sin\theta = \frac{\tan\theta}{\sqrt{1 + \tan^2\theta}}$$

Sin θ could be obtained from the relationship
$\sin\theta = \sqrt{1 - \cos^2\theta}$, for example, but the use of the tangent relationship
preserves the 12 digit accuracy for very small angles, even in the
range of θ < 10⁻¹². The proper signs of the functions are assigned
from the scaling constants.
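
The accuracy point can be checked numerically. The sketch below uses ordinary double-precision floats rather than the calculator's 12-digit decimal arithmetic, so the threshold at which the cosine route fails differs, but the effect is the same; the function names are illustrative.

```python
import math


def cos_sin_from_tan(t):
    """Return (cos, sin) of the angle whose tangent is t, for |angle| < pi/2."""
    c = 1.0 / math.sqrt(1.0 + t * t)
    return c, t * c


def sin_from_cos(t):
    """Sine obtained via sqrt(1 - cos^2); loses small angles entirely."""
    c = 1.0 / math.sqrt(1.0 + t * t)
    return math.sqrt(1.0 - c * c)


theta = 1e-9
t = math.tan(theta)
print(cos_sin_from_tan(t)[1], sin_from_cos(t))
```

For a very small angle the cosine rounds to exactly 1, so the square-root route returns 0 while the tangent route still returns the angle to full precision.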

For the polar to rectangular functions, cos θ and sin θ are computed
and multiplied by the radius vector to obtain the X and
Y  coordinates.  In  performing  the  rectangular  to  polar  function, 
the  signs  of  both  the  X  and  Y  vectors  are  retained  to  place  the 
resulting  angle  in  the  right  quadrant. 

Prescaling must also precede the inverse circular functions,
since this routine operates on arguments less than or equal to 1.
The inverse circular algorithm yields arctangent functions, making
it necessary to use the trigonometric identity

$$\sin^{-1}(x) = \tan^{-1}\left(\frac{x}{\sqrt{1 - x^2}}\right)$$

If cos⁻¹(x) is desired, the arcsin relationship is used and a scaling
constant adds π/2 after completion of the function. For arguments
greater than 1, the arccotangent of the negative reciprocal is found
which yields the arctangent when π/2 is added.
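
These reductions can be sketched with `math.atan` standing in for the calculator's core arctangent iteration. The function names are hypothetical, and the arccosine and large-argument cases are written with the standard identities cos⁻¹(x) = π/2 − sin⁻¹(x) and tan⁻¹(x) = π/2 + tan⁻¹(−1/x) for x > 0, which is one reading of the scaling described above.

```python
import math


def arcsin_via_arctan(x):
    """arcsin(x) for |x| <= 1 using only an arctangent."""
    if abs(x) > 1.0:
        raise ValueError("argument must satisfy |x| <= 1")
    if abs(x) == 1.0:                       # avoid dividing by zero at the end points
        return math.copysign(math.pi / 2, x)
    return math.atan(x / math.sqrt(1.0 - x * x))


def arccos_via_arcsin(x):
    """arccos(x) obtained from the arcsin relationship plus a pi/2 constant."""
    return math.pi / 2 - arcsin_via_arctan(x)


def arctan_large(x):
    """arctan(x) for x > 1, keeping the core routine's argument within [-1, 0)."""
    if x <= 1.0:
        raise ValueError("intended for arguments greater than 1")
    return math.atan(-1.0 / x) + math.pi / 2


print(arcsin_via_arctan(0.5), arccos_via_arcsin(0.5), arctan_large(3.0))
```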

Exponential  and  logarithms 

The exponential routine uses a compound iteration algorithm
which has an argument range of 0 to the natural log of 10 (ln 10).
Therefore, to be able to handle any argument within the dynamic
range of the calculator, it is necessary to prescale the absolute
value of the argument by dividing it by ln 10 and saving the integer
part to be used as the exponent of the final answer. The fractional
part is multiplied by ln 10 and the exponential found. This number
is the mantissa, and with the previously saved integer part as a
power of 10 exponent, becomes the final answer.
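
The prescaling can be sketched as below, with `math.exp` standing in for the compound iteration algorithm, which the text does not detail. The function name and the returned `(mantissa, exponent)` pair are assumptions.

```python
import math

LN10 = math.log(10.0)


def exp_prescaled(x):
    """Return (mantissa, exponent) with e**|x| == mantissa * 10**exponent."""
    q = abs(x) / LN10
    exponent = int(q)                   # integer part becomes the power-of-ten exponent
    reduced = (q - exponent) * LN10     # fractional part scaled back into [0, ln 10)
    mantissa = math.exp(reduced)        # stand-in for the core routine; lies in [1, 10)
    return mantissa, exponent


m, e = exp_prescaled(10.0)
print(m, e)
```

Since the reduced argument lies between 0 and ln 10, the core routine's result falls between 1 and 10 and can serve directly as a normalized decimal mantissa.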


The exponential answer is reciprocated in case the original
argument was negative, and for use in the hyperbolic functions.
For these hyperbolic functions, the following identities are used:

$$\sinh x = \frac{e^{x} - e^{-x}}{2}$$

$$\cosh x = \frac{e^{x} + e^{-x}}{2}$$

$$\tanh x = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
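
A short sketch of how one exponential and one reciprocal yield all three hyperbolic functions. The function name is illustrative, `math.exp` stands in for the calculator's exponential routine, and handling the sign separately is a choice made here rather than something the text specifies.

```python
import math


def hyperbolics(x):
    """Return (sinh x, cosh x, tanh x) from a single exponential of |x|."""
    ex = math.exp(abs(x))        # e**|x|
    inv = 1.0 / ex               # reciprocal gives e**-|x|
    s = (ex - inv) / 2
    c = (ex + inv) / 2
    t = (ex - inv) / (ex + inv)
    if x < 0:                    # sinh and tanh are odd; cosh is even
        s, t = -s, -t
    return s, c, t


print(hyperbolics(-1.3))
```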


Natural  logarithms 

The exponential routine in reverse is used as the routine for natural
logs, with only the mantissa operated upon. Then the exponent
is multiplied by ln 10 and added to the answer. This routine also
yields the log₁₀ and arc hyperbolic functions:

$$\log_{10} x = \frac{\ln x}{\ln 10}$$

$$\sinh^{-1}(x) = \ln\left(x + \sqrt{x^2 + 1}\right)$$

$$\cosh^{-1}(x) = \ln\left(x + \sqrt{x^2 - 1}\right)$$

$$\tanh^{-1}(x) = \ln\sqrt{\frac{1 + x}{1 - x}}$$

The sinh⁻¹(x) relationship above yields reduced accuracy for
negative values of x. Therefore, in the Model 9100A, the absolute
value of the argument is operated upon and the correct sign affixed
after completion.
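
The logarithm reduction and the sign handling for sinh⁻¹ can be sketched as follows. The function names are hypothetical, `math.log` stands in for the calculator's logarithm routine, and the failure shown is for double-precision floats; the 12-digit decimal machine would lose accuracy at different magnitudes.

```python
import math

LN10 = math.log(10.0)


def ln_from_parts(mantissa, exponent):
    """ln(mantissa * 10**exponent): log of the mantissa plus exponent * ln 10."""
    return math.log(mantissa) + exponent * LN10


def log10_from_ln(x):
    return math.log(x) / LN10


def arcsinh_direct(x):
    """Direct use of ln(x + sqrt(x^2 + 1)); cancels badly for negative x."""
    return math.log(x + math.sqrt(x * x + 1.0))


def arcsinh_signed(x):
    """Operate on |x| and affix the sign afterwards, as the text describes."""
    return math.copysign(math.log(abs(x) + math.sqrt(x * x + 1.0)), x)


print(ln_from_parts(2.5, 3), arcsinh_signed(-1e8))
```

For a large negative argument the two terms inside the logarithm cancel, and in the extreme case the direct form is left taking the logarithm of zero, which is the loss of accuracy the sign trick avoids.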

Accuracy 

It can be seen from the discussion of the algorithms that extreme
care has been taken to use routines that have accuracy commensurate
with the dynamic range of the calculator. For example, the
square root has a maximum possible relative error of 1 part in
10^10 over the full range of the machine.

There are many algorithms for determining the sine of an angle;
most of these have points of high error. The sine routine in the
Model 9100A has consistent low error regardless of quadrant.
Marrying a full floating decimal calculator with unique mathematical
algorithms results in accuracy of better than 10 displayed
digits.


Section  5 


Processors  with  stack  memories 
(zero  addresses  per  instruction) 

This  section  contains  only  computers  which  use  a  stack  memory 
in their Pc and hence are denoted Pc.stack. Although the implementation
details differ, they are based on the common idea
of a stack as described in Chap. 3, page 62. Several theory or
language-based  processors— IPL-VI  and  EULER— use  a  stack  in 
Mp.  However,  for  these  language-based  machines  the  stack  is 


not the main design theme as it is with the other computers

in  Table  1.  In  fact,  data  in  IPL-VI  are  organized  (Chap.  30)  about 
lists,  which  are  a  more  general  data  structure  than  stacks.  A 
stack  permits  push  and  pop  operations  to  be  performed  on  the 
top  of  the  stack;  a  list  permits  push  and  pop  operations  to  be 
performed  on  each  cell  of  the  list  (they  are  then  called  insert 


Table 1  Pc.stack computers

Company or basis /            Disclosure  Delivery                                                   Relative
computer name                 date(a)     date      Ancestry                                         power            References

English Electric
  KDF 9                       /60         4/63      George(c)                                                         AllmR62, DaviG60, HambC62
Burroughs (Paoli, Pa.)
  D825(d)                     /61                                                                                     [illegible]
  D830(d)                                           extended performance D825
  B 8500(e)                   4/66(b)     1/67(f)   developed at laboratory producing D825, D830     20-30
Burroughs (Pasadena, Calif.)
  B 5000                      /62         2/63                                                       1/2              AllmR62, BartR61, BockR63, CarlC63,
  B 5500                                  11/64     successor to B 5000                              1-1.7(g)-1.9(g)  LoneW61, HaucE68
  B 6500                                  1/68(f)   B 5500 based with improved multi- and            5-6
                                                    shared-programmed mapping
  B 7500                                            extended performance B 6500                      10
Theory or language-based:
  IPL-VI                      /58                   language: IPL-IV, V                                               ShawJ58
  EULER                       /67         /67       language: EULER (ALGOL+)                                          WebeH67, WirtN66a,b
  ALGOL                                             language: ALGOL                                                   AndeJ61
  Argonne Laboratory IPL-VC                         language: IPL-V                                                   HodgD64

(a) First edition of manual, or a paper, or the appearance in Adams Computing Characteristics Quarterly.
(b) Still evolving. B 8501 was discontinued in 1968.
(c) George, University of New South Wales, interpreter using Polish notation and a stack. Circa 1957 [Hamblin, 1952].
(d) Produced for command and control (military) applications.
(e) B 8500 is a system name; the Pc is a B 8501.
(f) Reported. Actual delivery unknown.
(g) Dual processor.



[PMS diagram. Mp(#0:7)(1) connects through S(3) to Pc(#A:B)(2) and to Kio(#1:4); the Kio connect through S(4) to K's for the following devices:
  T(console; typewriter)
  T(#1:2; card; reader)
  T(#1:2; paper tape; reader)
  T(card; punch)
  T(#1:2; line; printer)
  Ms(#1:2; drum)
  Ms(#1:16; magnetic tape)]

(1) Mp(core; 4 μs/w; 4096 w; (48,1) b/w)
(2) Pc(stack; 12 b/syllable; 6 b/char; data: si, sf, bv, w, char.string; (1|2) syllable/instruction; Mps(~4 w); antecedents: 'ALGOL language; descendants: 'B 5500, B 6500, B 7500; technology: transistor; ~(1961 ~ 1963))
(3) S(from: 2 Pc, 4 K; to: 8 Mp; concurrency: 4)
(4) S(from: 4 Kio; to: KT, KMs; concurrency: 4)


Fig.  1.  Burroughs  B  5000  PMS  diagram. 


and delete, respectively). Thus a list is like a nested set of
overlapping stacks. EULER (Chap. 32) uses a stack to store
temporary data and subroutine calls both when compiling and
when interpreting the compiled program. However, the language-based
machines can still be studied profitably with the
stack in mind.
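
The distinction between the two structures can be sketched in a few lines. The class and function names are illustrative assumptions, and a Python list is used as the underlying storage in both cases.

```python
class Stack:
    """Push and pop are allowed only at the top."""

    def __init__(self):
        self._cells = []

    def push(self, value):
        self._cells.append(value)

    def pop(self):
        return self._cells.pop()


def list_insert(cells, position, value):
    """A 'push' at an arbitrary cell of a list."""
    cells.insert(position, value)


def list_delete(cells, position):
    """A 'pop' at an arbitrary cell of a list."""
    return cells.pop(position)


s = Stack()
s.push("a")
s.push("b")
print(s.pop())                 # only the most recent entry is reachable

cells = ["a", "b", "c"]
list_insert(cells, 1, "x")     # insertion in the middle
print(list_delete(cells, 2), cells)
```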

The following comments will be directed to the Pc.stack computers
manufactured by both English Electric and Burroughs.
There are three basic Pc.stack computer families: B 5000 → B
5500 → B 6500/B 7500; D825 → D830 → B 8500; and KDF9.
Each root member was made available at about the same time
by Burroughs (Pasadena, Calif.), Burroughs (Paoli, Pa.), and
English Electric. The IBM Corporation later responded with a
proposed Pc.stack, but the machine never entered the production
phase.

The Pc.stack is a major alternative to the main line organization
of 1 address per instruction (augmented with index registers
or general registers). It tries to capitalize on the hierarchical
character of computation to avoid having to give memory
shuffling instructions explicitly. In Chap. 3, page 64, we gave
a comparison of a trivial computation using a stack and a
general-register organization, in order to make clear the case

[PMS diagram. Elements shown: Mp(#0:31)(1); S(2); Pc(#A) and Pc(#B)(3), each with a T.console; Kio(7) with 'Data Channel Switching to the peripheral K's(5)(6); Kio(8) with K(#1:20) and L('Real Time Device); C(4) with K(#1:4), S, K(#1:4), S, K(#1:64) and L('Telephone line; 100 ~ 180 b/s).]

(1) Mp((core; 1.2 μs/w)|(thin film; .6 μs/w); 16 kw; 51 b/w)
(2) S(32 Mp; 4(Pc,K,S); concurrency: 4)
(3) Pc(stack; technology: integrated circuits; ~1969; data: sf, df, i, char.string, boolean vector, address integer; 4,6,8 b/char)
(4) C('Data Communications Processor)
(5) Identical peripheral structures possible with two switches
(6) See Figures 3, 4, and 5.
(7) Kio('Input/Output Multiplexor)
(8) Kio('Real Time Adapter)


Fig.  2.  B  6500,  B  7500  PMS  diagram. 




[PMS diagrams, four configurations, each connected by L to the Kio:
  a. 1 K for 1 Ms(disk)
  b. 1 K for 2 Ms(disk)
  c. 2 K for 5 Ms(disk), using S(2K; 5X)
  d. 4 K for 10 Ms(disk), using S('4K; 10X) with X(#1:9) and X(#10)]

L(to: Kio('Input/Output Multiplexor))
K('Disk Peripheral Controller)
X := (K('Electronics Unit) S Ms(#1:5))
Ms(fixed head disk; (216|395) kby/s)


Fig.  3.  Burroughs  B  6500.  B  7500  Ms  (dis1<)  PMS  diagrams. 


for  stacks.  However,  we  did  not  there  attempt  any  analysis.  It 
has  been  asserted  [Amdahl  et  al.,  1964a]  that  the  Pc. stack 
derives  its  power  only  from  its  having  some  fast-working  mem- 
ory in  the  Pc,  thus  that  it  is  dominated  by  the  general-register 
organization.  Our  own  feeling  is  that  the  compile  and  compiled 
program  execution  times  for  the  Pc. stack  are  indeed  impressive. 
However,  no  definitive  analysis  has  been  published,  as  far  as 
we  know.  Pc. stack  is  certainly  an  organization  that  rates  serious 
study  by  any  computer  designer. 

The  PMS  structure  of  the  examples 

The PMS structure diagrams of the B 5000 and B 6500/B 7500
(Figs. 1 to 5) should be compared with Burroughs' own structure
representation (Chap. 22, page 268). The D825 structure is
similar; it is given in Chap. 36, page 447. All the Burroughs
computers in Table 1 have the multiprocessor structure.

Burroughs  was  probably  the  first  computer  company  to  take 
matters  of  the  structure  and  organization  seriously.  The  D825 
hardware  and  software  were  designed  for  military  command 


and  control  applications  which  demand  very  high  uptime  and 
availability.  As  various  computer  components  in  the  structures 
fail,  continuous  operation  is  possible  at  a  reduced  level  through 
the  fail-soft  design.  However,  to  our  knowledge,  no  published 
account  exists  on  how  well  this  design  works  in  practice  from 
a  performance  and  reliability  viewpoint.  The  philosophy  and 
details  of  the  D825  software  and  hardware  are  discussed  in 
Chap.  36. 

The  structures  in  the  B  6500,  especially,  allow  Kio's  to  be 
freely  assigned  to  any  T  or  Ms,  thereby  achieving  better  equip- 
ment utilization.  The  S(16  Mp;  16  P)  is  probably  overdesigned 
in  the  Burroughs  B  6500  computers.  These  structures  generally 
have a maximum of 4(P + Kio), although the design is based on
16(P  +  Kio).  The  Kio's  (Chap.  22)  may  be  overdesigned,  too, 
since  a  K  capable  of  controlling  a  simple  T.card_reader  can 
also  control  a  complex  Ms. disk  or  Ms.magnetic_tape. 

The  PMS  structure  of  the  English  Electric  KDF9  (Fig.  6) 
is fairly simple. The 16 K's for direct memory access appear


[PMS diagrams not reproduced; the panel titles and notes follow.]

a. 1 K for 8 Ms(magnetic tape), with S(1 K; 8 Ms; bus)
b. 2 K for 10 Ms(magnetic tape), with S(2 K; 10 Ms)
c. 4 K for 16 Ms(magnetic tape), with S(4 K; 16 Ms)

L(to: Kio('Input/Output Multiplexor))
K('Peripheral Controller)
Ms(magnetic tape; 9 ~ 144 kchar/s; 6|8 b/char;
200|556|800|1600 char/in; forward and reverse motion)

Fig. 4. Burroughs B 6500, B 7500 Ms(magnetic tape) PMS diagrams.


260  Part  3  |  The  instruction-set  processor  level:  variations  in  the  processor 




[PMS diagram not reproduced; the K-T pairs and notes follow.]

K  T(console; keyboard, printer)
K  T(card; reader)
K  T(card; punch)
K  T(paper tape; reader)
K  T(paper tape; punch)
K  T(CRT; display)
K  T(line; printer)

L(to: Kio('Small Peripheral Control))

Fig. 5. Burroughs B 6500, B 7500 peripheral K-T PMS diagrams.


to be both overdesigned (or overly general) and too few in
number. The limit of only 16(T + Ms) components is small,
especially considering that the KDF9 is to be time-shared from
several consoles.


The ISP of the examples

The comparison of Pc.stack, Pc.1address, and Pc.general_registers
(page 64) makes the assumption that an unlimited
hardware stack resides in Pc. The B 5500 has a local M.stack
in Pc of 4 words. The size and number of stacks, and their
use by software, are most important. The IPL-VI machine
has any number of stacks since the front of each list is a stack.
The KDF9 (Fig. 6) has two independent stacks: one for arithmetic
expression evaluation and one for holding subroutine
return addresses. The DEC 338 P.display (Chap. 25) uses a
stack for storing subroutine return addresses.

Unfortunately, we have not been able to include a discussion
of the "cactus stack" of the B 6500, which is a data structure
more like a list [Hauck and Dent, 1968]. The Hauck and Dent
paper describes both the relationship to a Pc.stack and its
relevance to program mapping and memory management for
multiprogramming.

The C('D825) parameters are given in Fig. 7. The D825 ISP
differs from other Pc.stack computers in that the data, d, for
operations can be in either of two places, the stack or Mp.
Consider the unary or binary operations:


[PMS diagram not reproduced; the notes to the diagram follow.]

Mp(core; 6 μs/w; 4 ~ 32 kw; 48 b/w)
S(16 Mp; 16(P,K); concurrency: 1)
K(#1:16; data vector transmission to Mp), serving
Ms(magnetic tape), T(typewriter), T(paper tape)
Pc(stack; 8 b/syllable; 0 ~ 1 address/instruction; 6 b/char;
technology: transistor; data: syllable, char, w, by, si,
sf, df, hw; 1-3 syllables/instruction;
Mps('Subroutine Jump Nesting Store: stack;
'Nesting Store[0:15]<0:47>: arithmetic stack;
'Q-store[0:15]: used for indexing, and contains a counter,
an increment, and a modifier))

Fig. 6. English Electric KDF9 PMS diagram.


C('Burroughs D825; multiprocessor structure;
S(cross-point; 16 M; 16(Pc,Kio))
Mp(4.33 μs/w; 65 kw; (48, 1 parity) b/w);
S(cross-point; 4 Kio; 64 (T,Ms));
T(console, paper tape, printer, card, time, communication
link);
Ms(drum, disk, magnetic tape);
Kio(#1:4);
Pc(#1:2; 12 b/syllable; stack; 0 ~ 3 addresses/instruction;
multiprogrammed; data: (integer, floating, single character,
fractional precision word, boolean vector);
operations: (+, -, ×, /, ∧, ∨, ⊕, ¬, round, [si] ← [sf], abs,
negate, -abs);
instruction-size: (1 ~ 7) syllable;
operation-code-size: 5/12 syllable;
address-size: (7/12 + 0 ~ 6) syllable;
operation forms: (d3 ← d1 b d2, d2 ← u d1);
variable addresses: (stack, Mp[syllable + BAR],
Mp[syllable + BAR + X[A] + X[B] + X[C]]);
Mps('Stack/S, 'Index Registers[1:15]/X[1:15],
'Index Comparison Limit Registers[1:15],
'Base Address Registers/BAR,
'Program Address Register/PAR,
'Program Counter/PC)))

Fig. 7. Burroughs D825 PMS diagram.




d2 ← u d1
d3 ← d1 b d2

In either of these cases d1, d2, or d3 can be the top of Stack/S
or Mp[Address + Base Address + Σ Index registers[A, B, C]].
This flexibility allows the Pc to behave as a 0, 1, 2, or 3 address
per instruction processor.
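
The operand flexibility described above can be illustrated with a small Python sketch. It is not the D825 logic: the names `Machine`, `read`, `write` and `binary`, the 16-entry index register file, and the convention that `None` means "top of stack" are all assumptions made for the illustration. The only ideas taken from the text are that each operand is either the stack top or a memory cell addressed by address + base + index registers, and that the number of memory operands fixes the effective address count.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical operand form: None means "top of Stack/S";
# otherwise (address, index-register numbers) selects a cell of Mp.
Operand = Optional[Tuple[int, Tuple[int, ...]]]


@dataclass
class Machine:
    mp: List[int]                                            # primary memory
    bar: int = 0                                             # base address
    x: List[int] = field(default_factory=lambda: [0] * 16)   # index registers
    stack: List[int] = field(default_factory=list)


def effective_address(m: Machine, operand: Tuple[int, Tuple[int, ...]]) -> int:
    address, indices = operand
    return address + m.bar + sum(m.x[i] for i in indices)


def read(m: Machine, operand: Operand) -> int:
    if operand is None:
        return m.stack.pop()
    return m.mp[effective_address(m, operand)]


def write(m: Machine, operand: Operand, value: int) -> None:
    if operand is None:
        m.stack.append(value)
    else:
        m.mp[effective_address(m, operand)] = value


def binary(m: Machine, b: Callable[[int, int], int],
           d1: Operand = None, d2: Operand = None, d3: Operand = None) -> int:
    """Perform d3 <- d1 b d2; return how many memory addresses were used."""
    right = read(m, d2)      # if both operands are on the stack, d2 is on top
    left = read(m, d1)
    write(m, d3, b(left, right))
    return sum(d is not None for d in (d1, d2, d3))


m = Machine(mp=list(range(100, 132)), bar=8)
m.x[1] = 2
m.stack = [5, 7]
zero = binary(m, lambda p, q: p - q)                  # 0-address: 5 - 7
one = binary(m, lambda p, q: p + q, d2=(0, (1,)))     # 1-address: top + Mp[0+8+2]
three = binary(m, lambda p, q: p + q, (1, ()), (2, ()), (3, ()))  # 3-address
```

The same `binary` routine thus covers the 0-address, 1-address and 3-address cases simply by varying which operands name memory.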

The B 5000 is more conventional than the D825 in its use
of stacks (see references, Table 1). There are only load and
store (that is, push and pop) instructions to transfer data
between Mp and one stack. Actually, the B 5000 has several
important features that make it worthy of study:

1  The stacks.

2  Data-type specification. A data type is declared by placing
a type identifier with the data. Thus, for example, there
is one add operation for both fixed and floating point,
the data telling which addition is to take place.

3  Multiprogram mapping. Descriptors are used to access
variables (scalars, vectors, and arrays). This indirect
addressing technique allows multiprogramming; however,
the reader should note that the data are not protected
against other accesses (corrected in the B 6500).

4  Failure of the Pc.stack for character processing. The
B 5000 has a character mode to allow processing of
string data, and the stack is not used in this mode. In
effect, a separate string processing ISP is incorporated
in the Pc.

5  Multiprocessing. A B 5000 can have two Pc's.

A  command  structure  for  complex  information  processing 

The  IPL-VI  (Chap.  30)  is  discussed  in  Part  4,  Sec.  4  page  348 
as  a  language-based  processor. 

Microprogrammed implementation of EULER
on IBM System/360

EULER (Chap. 32) is discussed in Part 4, Sec. 4 page 348 as
a microprogrammed, language-based processor.


Chapter  21 

Design of an arithmetic unit
incorporating a nesting store¹


R.  H.  Allmark  /  J.  R.  Lucking 

Summary  This paper describes the arithmetic unit of a computer whose
order code is based on the Reverse Polish algebraic notation. The order
code has been realised by causing the arithmetic unit to operate on data
stored in the most accessible registers of a nesting store: these registers
are of the transistor flip-flop type but are backed up by sixteen fast magnetic
core registers. The functions are performed as micro-programmes of transfers
between the registers in the arithmetic unit, and the necessary arrangement
of transfer paths, logical gates and arithmetic circuits is described.
The number system is binary, using the two's-complement representation
of negative numbers. Automatic floating-point operations are included
which use an autonomous unit to perform the shifts required.

Introduction 

The  arithmetic  unit  of  a  general  purpose  digital  computer  contains 
circuits  to  perform  at  least  the  basic  operations  of  addition,  sub- 
traction, multiplication  and  division.  In  many  machines  it  is  possi- 
ble to  use  some  of  the  registers  in  the  arithmetic  unit  as  temporary 
storage  for  the  partial  results  arising  during  a  calculation;  thus 
the  accumulator  of  a  one-address  machine  is  used  to  store  the 
result of the last arithmetic operation. The arithmetic unit described
in this paper uses a nesting store, operating on the
last-in-first-out principle, for the storage of its data and partial results.
The nesting store consists of a stack of cells, of which only the
most accessible supply data to the arithmetic unit; the results are
automatically returned to the most accessible cells and the original
operands erased, less accessible information being moved into the
cells made vacant by the operation.

The  computer  and  its  order  code 

The  arithmetic  unit  is  part  of  a  general  purpose  synchronous 
system,  working  in  the  parallel  mode,  with  main  core  storage  of 
(up to) 32,768 48-bit words, and provision for the time sharing of
up  to  4  programmes.  The  order  code  of  the  computer  is  based 

¹Proc. IFIP Congr. 62, pp. 694-698, 1962.


on  the  Reverse  Polish  algebraic  notation,  and  contains  four  groups 
of  operations: 

a    Transfers  between  the  arithmetic  unit  and  the  main  store. 

b    Arithmetic, logical and manipulative functions on data in
the nesting store.

c    Conditional  and  unconditional  jump  instructions  used  to 
interrupt  the  normal  sequencing  of  instructions. 

d    Instructions  for  controlling  the  operation  of  the  various 
peripheral  devices  which  may  be  attached  to  the  machine. 

Main  store  transfers  include  instructions  for  transferring  half 
and  full-length  words  to  the  most  accessible  cell  of  the  nesting 
store,  information  already  in  the  stack  being  retained  by  transfer 
to  the  less  accessible  cells.  The  contents  of  the  most  accessible 
cell  of  the  stack  may  be  stored  in  the  main  store;  they  are  then 
automatically  erased  from  the  stack  while  information  is  moved 
from  the  less  accessible  cells  to  a  more  accessible  position. 

Arithmetic  operations  also  feature  the  transfer  of  data  in  the 
nesting  store  so  that  the  operands  are  destroyed,  the  results  are 
left  in  the  most  accessible  cell  (or  cells),  and  data  not  involved 
in  the  operation  are  moved  to  fill  any  vacated  cells. 

Thus  the  programme  for  evaluating 

f = (a - b)/(c + de)
may be written:

fetch a,
fetch b,
subtract (forming a - b in the most accessible cell
and erasing both a and b from the stack),
fetch d,
fetch e,
multiply (forming de in the most accessible cell,
erasing d and e, and thus leaving a - b in the
second most accessible cell),




fetch c,
add (forming c + de),
divide (forming f),
store as f (leaving the nesting store in the same state
as before the fetch a instruction).
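
The programme above can be traced with a minimal Python sketch of a nesting store. This is an illustration only: the instruction spellings, the `run` function, the dictionary standing in for the main store, and the sample values are all assumptions, not the machine's order code.

```python
import operator

# Two-operand functions: the second most accessible cell is the left operand.
OPERATIONS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
}


def run(program, store):
    """Execute a Reverse Polish programme; return the final nesting store."""
    nest = []                                  # last element = most accessible cell
    for instruction in program:
        name, *operand = instruction.split()
        if name == "fetch":
            nest.append(store[operand[0]])     # older data move down one cell
        elif name == "store":
            store[operand[0]] = nest.pop()     # erased from the nest when stored
        else:
            top = nest.pop()
            second = nest.pop()
            nest.append(OPERATIONS[name](second, top))
    return nest


store = {"a": 9, "b": 3, "c": 1, "d": 1, "e": 2}
program = [
    "fetch a", "fetch b", "subtract",
    "fetch d", "fetch e", "multiply",
    "fetch c", "add", "divide", "store f",
]
leftover = run(program, store)
```

With these sample values the nest is empty again after the final store, as the text says, and `store["f"]` holds (9 - 3)/(1 + 1·2).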

For instructions, the 48-bit word has been divided into 6 syllables
of eight bits each, and these are then treated as a continuous
sequence of variable length instructions. Arithmetic operations are
specified by single syllable instructions, but main store transfers
require three syllables to accommodate both the address and the
address modifying information of the word to which they refer;
jump instructions also have three syllables. Two-syllable instructions
include the peripheral transfers, and instructions for processing
address modifiers and performing shifts. The first syllable of
every instruction contains two bits whose values specify the length
of the instruction; the redundant case being used to differentiate
between main store transfers and jump instructions. The first
syllable of an instruction contains enough information to specify any
arithmetic unit operation required; thus in the machine, each
instruction is treated by two controls; the first or Store Control
organising the fetching and storing of information in advance of
the second or Arithmetic Unit Control which completes the
instruction on the information in the first syllable.
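
A short Python sketch may make the variable-length scheme concrete. The paper says only that two bits of the first syllable give the length, with the spare combination separating main store transfers from jumps; the particular assignment below (the two most significant bits, with 00, 01, 10, 11 meaning one syllable, two syllables, jump, and main store transfer) is an assumption chosen for illustration, as are the function names.

```python
# Assumed meaning of the two length bits of a first syllable.
KINDS = {
    0b00: ("one-syllable", 1),
    0b01: ("two-syllable", 2),
    0b10: ("jump", 3),
    0b11: ("main store transfer", 3),
}


def syllables_of(word):
    """Split a 48-bit word into its six 8-bit syllables, leftmost first."""
    return [(word >> shift) & 0xFF for shift in range(40, -1, -8)]


def decode(syllables):
    """Group a continuous syllable sequence into (kind, syllables) instructions."""
    instructions = []
    i = 0
    while i < len(syllables):
        kind, length = KINDS[syllables[i] >> 6]
        if i + length > len(syllables):
            raise ValueError("instruction runs past the end of the sequence")
        instructions.append((kind, syllables[i:i + length]))
        i += length
    return instructions


decoded = decode(syllables_of(0x054501C00010))
```

Because the syllables form one continuous sequence, a caller would concatenate the syllables of successive words before decoding, so that an instruction may straddle a word boundary.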

Range  of  functions 

The allocation of bits to the instructions described above allows
64 possible functions, of which 59 are used to specify the wide
range of operations needed in a general purpose computer.

As well as the normal single-length fixed-point arithmetic operations,
functions have been provided for the addition and subtraction
of double-length numbers. These simplify the programming
of multi-length operations as well as giving increased accuracy.
For normal scientific and engineering calculations automatic
floating-point facilities are available. A single length word may
represent a floating-point number with a 40-bit fractional part f, and
an 8-bit characteristic c: the value of the number is then f × 2^(c - 128).
The fractional part is limited to the range -1 ≤ f < -1/2, or
1 > f ≥ 1/2, or f = 0 when c is also zero. All floating-point operations
assume that operands are in this standard form and give
correctly rounded results in standard form. Functions for the addition
and subtraction of double-length floating-point numbers have
been provided, as these give increased accuracy and stability in
many matrix operations.
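
The standard form can be checked with a few lines of Python. The sketch works on exact (f, c) pairs rather than packed 48-bit words, and it assumes that the characteristic is held in excess-128 form, so that the value is f × 2^(c - 128); that bias, the closed and open ends of the ranges, and the function names are assumptions of the sketch. Characteristic overflow is ignored.

```python
from fractions import Fraction

BIAS = 128          # assumed offset of the 8-bit characteristic
HALF = Fraction(1, 2)


def value(f, c):
    """Number represented by fractional part f and characteristic c."""
    return f * Fraction(2) ** (c - BIAS)


def is_standard(f, c):
    """True when (f, c) is in the standard form described in the text."""
    if f == 0:
        return c == 0
    return -1 <= f < -HALF or HALF <= f < 1


def standardise(f, c):
    """Shift f and adjust c until the pair is in standard form."""
    if f == 0:
        return Fraction(0), 0
    while not is_standard(f, c):
        if f >= 1 or f < -1:
            f, c = f / 2, c + 1      # right shift, characteristic up
        else:
            f, c = f * 2, c - 1      # left shift, characteristic down
    return f, c


three = standardise(Fraction(3), BIAS)
```

Standardising never changes the value represented, only how it is divided between f and c.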


An  increase  in  operating  speed  and  a  saving  of  instructions  are 
effected  by  the  use  of  instructions  which  re-order  the  position  of 
information  in  the  most  accessible  cells  of  the  nesting  store,  in- 
cluding reversing  and  cycling  operations.  The  normal  logical  oper- 
ations are  provided. 

All arithmetic operations in the arithmetic unit are carried out
on binary numbers using the two's-complement notation for negative
numbers; instructions being provided for the conversion to
and from binary of information stored as 6-bit characters in other
radix systems. For the convenience of the programmer, double-length
numbers are stored in the arithmetic unit with their more
significant half in a more accessible cell; the sign of the less
significant half is ignored and is set positive after all double-length
operations.

The  nesting  store 

Although the concept of a nesting store is similar to that of a rifle
magazine where the addition of a cartridge displaces those already
there, movement of information only occurs in the three most
accessible cells of the nesting store, which are transistor flip-flop
registers forming part of the arithmetic unit. The less accessible
cells are core registers which are addressed in a sequential manner
by a reversible counter. Reading from these cores reduces the
count by one, thus selecting the next word; the read-out is destructive
so that the cores are in the correct state for a subsequent
writing operation, which is the reverse of a read. The access time
of the cores is reduced by providing separate counters and reading
and writing mechanisms for the odd and even numbered rows of
cores; thus when reading or writing from odd rows the addressing
mechanism for the next even row is set, so that it is available for
immediate use. Thus with a simple one core per bit system successive
reads can be made at 1 μsec intervals and writes at 2 μsec
intervals; as these operations are performed in parallel with the
functioning of the arithmetic unit, their times do not increase the
time required to complete the functions.
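
The division of the nesting store into three register cells and a counter-addressed core part can be modelled as follows. This is a behavioural sketch only: the class and attribute names are invented, the sixteen-cell core capacity is taken from the Summary, and the odd/even interleaving and all timing are left out.

```python
class NestingStore:
    """Three flip-flop cells backed by counter-addressed core cells."""

    CORE_CELLS = 16

    def __init__(self):
        self.w = []                          # w[0] is the most accessible cell
        self.core = [0] * self.CORE_CELLS
        self.count = 0                       # reversible counter: occupied core cells

    def push(self, value):
        if len(self.w) == 3:                 # least accessible register must spill
            if self.count == self.CORE_CELLS:
                raise OverflowError("nesting store full")
            self.core[self.count] = self.w.pop()   # write raises the count
            self.count += 1
        self.w.insert(0, value)

    def pop(self):
        if not self.w:
            raise IndexError("nesting store empty")
        value = self.w.pop(0)
        if self.count:                       # refill the vacated register from core
            self.count -= 1                  # read lowers the count
            self.w.append(self.core[self.count])
            self.core[self.count] = 0        # read-out is destructive
        return value


nest = NestingStore()
for item in (1, 2, 3, 4, 5):
    nest.push(item)
registers_after_pushes = list(nest.w)
core_count_after_pushes = nest.count
popped = [nest.pop() for _ in range(5)]
```

Only the three register cells ever change position; the core part is touched solely at the position selected by the counter.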

The  arithmetic  unit 

As shown in Fig. 1, there are six full length transistor flip-flop
registers in the arithmetic unit; there are also two 8-bit registers
used when performing floating-point operations. The main facilities
associated with these registers are as follows.

W1, W2 and W3 are the three most accessible cells of the
nesting store; transfers to the core part of the nesting store, being



[Block diagram not reproduced. Labels in the diagram: main transfers;
nesting store core registers; to store control; auxiliary transfers
and shifts; right shifts of 0, 1, 2, 5, 8 or -8; left shifts of
0, 1, 2, 5, 8 or -8; CD (8 bits); C (8 bits); switch; standardisation
and conversion logic; A.U. control pulses; shift control;
characteristic modifier.]

Fig. 1. Block diagram of the arithmetic unit. Full lines represent
information transfers; dotted lines represent control pulses. All
registers are 48 bits long unless otherwise stated.


made via W3. W1 and W2, together with B1 and B2, form a
double-length shifting register which may be used as two independent
single-length shifting registers.

B1 and B2 are the inputs to the 48-bit adder whose output may
be routed to W1, W2, or to the characteristic difference register
CD.

The adder contains 13 carry-skip stages which reduce the carry
propagation time to a maximum of 150 nsec. Subtraction is performed
by adding the subtrahend's complement to the minuend
with a carry inserted into the right-most adder stage.
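
The arithmetic behind this can be shown in a few lines of Python: adding the bitwise complement of the number being subtracted, together with a carry of one into the right-most stage, gives the two's-complement difference. The sketch models only the 48-bit result, not the carry-skip circuitry, and the function names are illustrative.

```python
WORD = 48
MASK = (1 << WORD) - 1


def add48(x, y, carry_in=0):
    """48-bit adder: sum modulo 2**48 with an optional right-most carry."""
    return (x + y + carry_in) & MASK


def sub48(minuend, subtrahend):
    """Subtract by adding the complement of the subtrahend plus a carry."""
    return add48(minuend, ~subtrahend & MASK, carry_in=1)


def to_signed(x):
    """Interpret a 48-bit pattern as a two's-complement integer."""
    return x - (1 << WORD) if x >> (WORD - 1) else x


difference = to_signed(sub48(5, 7))
```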

Nb acts as a buffer between store control and the arithmetic
unit, and together with B1 and B2, is used in nearly every function.

Arithmetic  unit  control  interprets  each  instruction  as  a  se- 
quence of  timed  pulses  along  lines  which  activate  the  various 
transfers  etc.,  between  the  registers.  The  sequences  have  been 
constructed  so  that  many  operations  are  performed  simultaneously, 
reducing  the  overall  time  to  a  minimum;  thus  the  function  sin- 
gle-length fixed-point  add  is  performed  by: 

i    Transferring W1, W2, W3 to B2, B1 and Nb respectively,
simultaneously commencing a read from the nesting store,
clearing the carry inserted into the right-most adder stage
and switching the adder's output to W1.

ii    Adding and simultaneously transferring Nb to W2.

Each step takes 0.5 μsec and by the end of the last step, W3
has been refilled from the core nesting store.
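
The two-step sequence can be followed in a small register-transfer sketch. The dictionary of registers, the list standing in for the core part of the nesting store (most accessible core cell last), and the choice to refill W3 with zero when the core part is empty are assumptions of the sketch; overflow detection is omitted.

```python
MASK = (1 << 48) - 1


def fixed_add(regs, core):
    """Single-length fixed-point add as two register-transfer steps."""
    # Step i: W1, W2, W3 -> B2, B1, Nb; start a read from the core nesting store.
    regs["B2"], regs["B1"], regs["Nb"] = regs["W1"], regs["W2"], regs["W3"]
    incoming = core.pop() if core else 0
    # Step ii: add (adder output switched to W1, no carry in); Nb -> W2.
    regs["W1"] = (regs["B1"] + regs["B2"]) & MASK
    regs["W2"] = regs["Nb"]
    # By the end of step ii the core read has refilled W3.
    regs["W3"] = incoming
    return regs


regs = {"W1": 2, "W2": 3, "W3": 10, "B1": 0, "B2": 0, "Nb": 0}
core = [30, 20]
fixed_add(regs, core)
```

After the call the sum sits in W1 and the rest of the nest has moved up by one cell, which is the effect the text describes.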

To speed up multiplication and division, these functions are
carried out in a separate unit employing the stored carry principle,
but the results are finally assimilated within the arithmetic unit.

A similar arithmetic unit operating only on single-length numbers
could be designed using only four full-length registers. At least
five registers are required to perform the function which interchanges
the contents of the two most accessible cells in the nesting
store with those of the next most accessible pair. The sixth register
enables all double-length arithmetic operations to be performed
without writing information back into the nesting store during the
function; this would have complicated the sequences and increased
the time for the functions.

When  determining  the  arrangement  of  transfer  paths  between 
the  various  registers,  it  was  found  sufficient  to  consider  only  the 
double-length  functions  which  required  complicated  or  lengthy 
sequences;  in  particular  the  function  for  adding  two  double-length 
floating  numbers  had  great  influence. 

An  overflow  indication  is  set  on  fixed-point  addition  and  sub- 
traction if  the  sign  of  the  result  differs  from  that  expected,  and 
on  floating-point  operations  if  the  characteristic  exceeds  the 
maximum  allowable;  shifting  may  also  cause  overflow. 

Shift  control 

Shifting operations are effected by transfers between W1 (and/or
W2) and B1 (and/or B2), and back again. The shift transfer paths
from the W to the B registers provide right shifts of 0, 1, 2, 5




or 8 places, and a left shift of 8 places; the paths from the B to
the W registers provide the same shifts in the reverse direction.
The two sets of shift paths are used alternately, those from the
W registers being used first; all shifts are terminated using a path
into the W registers. Shifts of a large number of places are accomplished
by a series of shifts of eight places in the appropriate
direction until the number of places remaining is less than eight;
if necessary the number is then transferred back into the W registers:
the remaining shifts, or the whole shift if the number of places
is less than eight, is then completed by a transfer to the B registers
and back again using two appropriate paths. With the shifts available,
extension of the B registers by two bits at the right-most
end enables any shift to be performed without loss of accuracy.
In double-length arithmetic shifts, the sign digit of the less
significant word is by-passed. When a shift is to be performed, the
number of places and the type of shift are transferred into a
semi-autonomous unit, called the shift control, which is then supplied
with a string of command pulses by the arithmetic unit control;
shift control then re-routes these pulses to perform the transfers
necessary to obtain the shift.
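
One way to read the rules above is as a small planning problem, sketched here in Python. The sketch is an interpretation, not the shift control's logic: positive amounts denote right shifts, the path names and function names are invented, and the test that a right shift which is later undone must not exceed two places is an inference from the remark about the two-bit extension of the B registers.

```python
W_TO_B = (0, 1, 2, 5, 8, -8)              # positive = right shift, -8 = left 8
B_TO_W = tuple(-s for s in W_TO_B)        # the same shifts, reverse direction


def final_pair(net):
    """Pick a W->B path and a B->W path whose combined shift is `net`."""
    for a in W_TO_B:
        for b in B_TO_W:
            # Opposite directions: the smaller move is "undone" by the larger,
            # and only two extra bits are assumed available to hold it.
            undone = min(abs(a), abs(b)) if a * b < 0 else 0
            if a + b == net and undone <= 2:
                return [("W->B", a), ("B->W", b)]
    raise ValueError(f"no path pair gives a shift of {net}")


def plan_shift(places):
    """Return the transfers, as (path, shift) pairs, for a shift of `places`."""
    sign = 1 if places >= 0 else -1
    eights, rest = divmod(abs(places), 8)
    steps = [("W->B" if k % 2 == 0 else "B->W", 8 * sign) for k in range(eights)]
    if eights % 2 == 1:
        steps.append(("B->W", 0))         # bring the number back into W unshifted
    if rest or not steps:
        steps += final_pair(rest * sign)
    return steps


plan_11_right = plan_shift(11)
```

Under these assumptions every remainder from 0 to 7 places, in either direction, can be made from one path of each set, which is consistent with the claim that two transfers always suffice for the final part of a shift.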

When performing floating-point addition and subtraction, shifts
are required to equalize the characteristics of the two numbers;
the amount of shift is calculated by a modified subtraction, operating
on the characteristic positions of the two numbers. After the
addition, the shift required to restore the result to standard form
is determined by logical circuits which interpret the pattern of
bits in W1 into shift information. The number of shifts performed
during this standardising operation is made available to the arithmetic
unit control for use in forming the correct characteristic
of the result.

The  character  conversion  operations  to,  and  from,  binary  are 
accomplished  by  shift  control,  using  a  method  involving  successive 
shifting  of  the  character  word,  and  adding  or  subtracting  portions 
of  the  radix  word. 

Examples  of  sequences 

To  illustrate  the  working  of  the  arithmetic  unit,  two  sequences 
are  described. 

a    -D (i.e. subtract the double-length fixed-point number in
W1 and W2 from the number in W3 and the most accessible
core register of the nesting store).

i    Transfer W1, W2, W3 to B2, B1 and Nb respectively,
simultaneously reading from the core nesting store.

ii    A dummy pulse.

iii    Transfer the complement of W2 to B2 (but setting the
sign of B2 positive), transfer W3 directly to B1 (W3
has by now been filled with fresh data), switch the
adder's output to W2, inserting a carry into the
right-most adder stage, and read from the nesting store.

iv    Add.

v    Transfer the complement of W1 to B1 and Nb to B2,
switch the adder's output to W1 and insert a carry
into the right-most adder stage if W2 is negative.

vi    Add, simultaneously clearing the sign of W2.

b    +F (i.e. add the two single-length floating numbers in W1
and W2).

i    Transfer the complement of W1 to B1, transfer W2
to B2 and switch the adder's output to register CD.

ii    Store the characteristic of W1 in the eight-bit register
C and add.

iii    Clear the characteristic positions of W1, simultaneously
transferring CD into the shift number register
in shift control. This latter operation is such that the
shift register contains minus the difference in
characteristics.

iv    Clear the characteristic of W2, and if W1 is about
to be shifted, determined by the sign digit of CD,
replace the contents of C by the characteristic of B2;
thus C contains the larger characteristic.

v    Supply control pulses to shift control and thus perform
the required right-shift of either W1 or W2.

vi    Having completed the shift, transfer W1, W2 and W3
to B2, B1 and Nb respectively, simultaneously switching
the adder's output to W1, clearing the carry into
the right-most adder stage and reading from the core
nesting store.

vii    Add the fractional parts, simultaneously transferring
Nb to W2.

viii    Supply control pulses to shift control so as to cause
it to enter the standardization procedure and perform
the shifts required.

ix    Store the complement of the number of left-shifts
performed in (viii) in the characteristic position of B2,
transfer C to the characteristic position of B1, switch
the adder to W1.

x    Perform a special add operation which only affects
the characteristic positions of W1.

The sum is thus formed in W1. Rounding the answer is carried
out using two special control pulses which complete all floating-point
operations; these call up logic to deal with the cases when
the rounding operation necessitates re-standardization of the
result.
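
Stripped of the register transfers, the +F sequence aligns the operands on the larger characteristic, adds the fractional parts, and standardises the result. The Python sketch below follows that outline on exact (f, c) pairs. It is not the micro-programme: it uses exact fractions, so nothing is lost in the alignment shift and no rounding step is needed, and the names and the handling of zero are assumptions.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def standardise(f, c):
    """Bring (f, c) to the form -1 <= f < -1/2, 1/2 <= f < 1, or f = c = 0."""
    if f == 0:
        return Fraction(0), 0
    while not (-1 <= f < -HALF or HALF <= f < 1):
        if f >= 1 or f < -1:
            f, c = f / 2, c + 1
        else:
            f, c = f * 2, c - 1
    return f, c


def float_add(x, y):
    """Add two (fractional part, characteristic) pairs."""
    (f1, c1), (f2, c2) = x, y
    c = max(c1, c2)                     # the larger characteristic is kept
    f1 = f1 / 2 ** (c - c1)             # right-shift the operand with the smaller one
    f2 = f2 / 2 ** (c - c2)
    return standardise(f1 + f2, c)      # add fractional parts, then standardise


total = float_add((Fraction(3, 4), 130), (Fraction(1, 2), 129))
```

Here the operands differ by one in characteristic, the aligned fractions sum to exactly 1, and standardisation moves that overflow into the characteristic.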




Conclusions 

The  advantages  of  a  machine  incorporating  a  nesting  store  in  the 
arithmetic  unit  are: — 

i    The machine is simple to programme using the machine
language.

ii    Programmes are faster, since many main store transfers are
eliminated, and the access time of the nesting store is
virtually zero. They are more compact because less information
is required to specify many instructions.

iii    As the operation of the arithmetic unit is largely independent
of the main store, their controls may readily be
separated. This allows store control to process instructions
whilst the arithmetic unit control processes a prior instruction,
thereby leading to faster execution of the programme.

The  main  disadvantage  is  an  increase  in  the  order  of  complexity 
involved. 

References 

AllmR62; DaviG60; HaleA62




Chapter  22 

Design of the B 5000 system¹

William  Lonergan  /  Paul  King 

Computing  systems  have  conventionally  been  designed  via  the 
'hardware'  route.  Subsequent  to  design,  these  systems  have  been 
handed  over  to  programming  systems  people  for  the  development 
of  a  programming  package  to  facilitate  the  use  of  the  hardware. 
In  contrast  to  this,  the  B  5000  system  was  designed  from  the  start 
as  a  total  hardware-software  system.  The  assumption  was  made 
that  higher  level  programming  languages,  such  as  ALGOL,  should 
be used to the virtual exclusion of machine language programming,
and that the system should largely be used to control its own
operation. A hardware-free notation was utilized to design a processor
with the desired word and symbol manipulative capabilities.
Subsequently this model was translated into hardware specifications,
at which time cost constraints were considered.

Design  objectives 

The fundamental design objective of the B 5000 system was the
reduction  of  total  problem  through-put  time.  A  second  major 
objective  was  facilitation  of  changes  both  in  programs  and  system 
configurations.  Toward  these  objectives  the  following  aspects  of 
the  total  computer  utilization  problem  were  considered: 

Statement  of  problems  in  higher-level  machine-independent 
languages; efficiency of compilation of machine language; speed of
compilation of machine language; program debugging in higher-level
languages; problem set-up and load time; efficiency of
system  operation;  ease  of  maintaining  and  making  changes  in 
existing  programs,  and  ease  of  reprogramming  when  changes  are 
made  in  a  system  configuration. 

Design  criteria 

Early  in  the  design  phase  of  the  B  5000  system  the  following 
principles  were  established  and  adopted: 

Program  should  be  independent  of  its  location  and  unmodified 
as  stored  at  object  time;  data  should  be  independent  of  its  location; 
addressing  of  memory  within  a  program  should  take  advantage 
of  contextual  addressing  schemes  to  reduce  redundancy;  provisions 

¹Datamation, vol. 7, no. 5, pp. 28-32, May, 1961.


should  be  made  for  the  generalized  handling  of  indexing  and 
subroutines;  a  full  complement  of  logical,  relational  and  control 
operators  should  be  provided  to  enable  efficient  translation  of 
higher-level source languages such as ALGOL and COBOL; program
syntax should permit an almost mechanical translation from
source languages into efficient machine code; facilities should be
provided to permit the system to largely control its own operation;
input-output operations should be divorced from processing and
should be handled by an operating system; multi-programming and
true parallel processing (requires multiple processors) should be
facilitated,  and  changes  in  system  configuration  (within  certain 
broad  limitations)  should  not  require  reprogramming. 

System  organization 

The B 5000 system achieves its unique physical and operational
modularity through the use of electronic switches which function
logically  like  telephone  crossbar  switches.  Figure  1  depicts  the 
basic  organization  of  the  system  as  well  as  showing  a  maximum 
system. 

Master control program

A master control program will be provided with the B 5000 system.
It will be stored on a portion of the magnetic drum. During normal
operations, a small portion of the MCP will be contained in core
memory. This portion will handle a large percentage of recurrent
system operations. Other segments of the MCP will be called in
from the magnetic drum, from time to time, as they are required
to handle less frequently occurring events or system situations.
Whenever the system is executing the master control program,
it is said to be in the Control State. All entries to the Control
State are made via 'interrupts.' A special operation is provided,
which can only be executed when the system is in the Control
State, to permit control to return to the object program it was
executing at the time the 'interrupt' occurred.

The following are a few typical occurrences which cause an
automatic interrupt in the system: An input-output channel is




available,  an  input-output  operation  has  been  completed  or  an 
indexing  operation  was  attempted  which  violated  the  storage 
protection  features  built  into  the  system. 

In addition to processing interrupt conditions, the master control
program handles fundamental parts of the total system operation
such as the initiation of all input-output operations, tanking
of  input-output  areas  when  required,  file  control,  allocation  of 
memory,  scheduling  of  jobs  (priority  ratings,  system  requirements 
of  each  object  program,  and  the  present  system  configuration  are 
considered),  maintenance  of  an  operations  log  and  maintenance 
of  a  system  description. 

Operating  modes 

The B 5000 can either operate with fixed-length words or with
variable-length fields. These two modes of operation are called the
word mode and the character mode. For certain operations, a
processor operating on words is most desirable and for other operations,
a variable field length mode of operation is most desirable.
By combining both abilities in one processor, a processor can
operate in the mode most desirable for the operation at hand. In
a B 5000 system, it is even possible for one processor to be operating
in the word mode and the other in the character mode.

When  operating  in  the  word  mode,  a  standard  format  for  the 
data  word  is  used  as  illustrated  in  Fig.  2. 

Note  that  the  standard  word  is  an  octal  floating  point  word. 
However,  the  mantissa  is  treated  as  an  integer  rather  than  as  a 
fraction  (heretofore  the  reverse  has  been  common  practice).  This 
provides two benefits: first, an integer has the same internal representation
as its unnormalized floating point correspondent; and,
second, the range of numbers that can be expressed, rather than
being from 8⁺⁶³ to 8⁻⁶³, is 8⁺⁷⁶ to 8⁻⁵⁰. The first feature eliminates




[Figure: the data word divided into the fields listed below.]

F-Flag (1 bit)
SE-Sign of Exponent (1 bit)
Exponent (6 bits)
SO-Sign of Operand (1 bit)
Integer Part (39 bits)

Fig.  2.  Data  word  —  word  mode. 


the need for fixed-to-floating point conversion; integers and floating
point numbers can be mixed in arithmetic calculations. The second
expands the range where trouble with range is most often encountered,
namely, in numbers with extremely large magnitude.
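
The effect of treating the mantissa as an integer can be checked with a little arithmetic. The following is a minimal sketch, not the machine's actual arithmetic: it assumes a value of the form integer part × 8^exponent, an exponent magnitude of at most 63 (six bits plus a sign), and the names `word_value`, `MAX_MANTISSA` and `MAX_EXPONENT` are illustrative only.

```python
from fractions import Fraction

MANTISSA_BITS = 39             # integer part, as given in Fig. 2
MAX_EXPONENT = 2 ** 6 - 1      # assumption: 6-bit magnitude plus a sign bit, so 63
MAX_MANTISSA = 2 ** MANTISSA_BITS - 1   # thirteen octal digits, all 7s


def word_value(integer_part, exponent):
    """Value represented by an integer part scaled by a power of eight."""
    return integer_part * Fraction(8) ** exponent


# An integer and its unnormalized floating point form share one representation:
# exponent zero, integer part equal to the number itself.
forty_two = word_value(42, 0)

# Largest magnitude with an integer mantissa versus a fractional mantissa.
largest_integer_convention = word_value(MAX_MANTISSA, MAX_EXPONENT)
largest_fraction_convention = word_value(
    Fraction(MAX_MANTISSA, 2 ** MANTISSA_BITS), MAX_EXPONENT
)
```

Under these assumptions the integer convention reaches just below 8 to the 76th power, while the fractional convention stops just below 8 to the 63rd, which is the gain at the large-magnitude end described above.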

The  flag  serves  a  dual  purpose.  The  function  of  the  flag  depends 
on  how  the  program  references  the  data  word.  If  the  data  word 
is a single variable and not an element of an array, the flag identifies
the word as being an operand, that is, a data word. If the word
is an element of an array, the flag may be used to identify this
particular element as an element of data which is not to be processed
by the normal program (for example, a boundary point in
mesh  calculations). 

When  operating  in  the  character  mode,  each  data  word  consists 
of eight alphanumeric characters as illustrated in Fig. 3. Programs
in the character mode can address any character in a word. Fields
can start at any position in a word. A processor in a single operation
can operate on fields of any length up to 63 characters long;
operations  on  fields  of  greater  length  can  easily  be  programmed. 
For  example,  two  57  character  fields  could  be  compared  in  a  single 
operation. 

There  are  two  instances  when  the  character  mode  operates  with 
words  of  the  type  used  in  the  word  mode.  Operations  are  provided 
in  the  character  mode  for  converting  numeric  information  in  the 
alphanumeric  representation  to  the  standard  word  type  of  the 
word  mode  and  vice  versa.  In  both  of  these  instances,  the  length 
of  the  alphanumeric  fields  being  converted  to  or  from  the  word 
mode  type  of  word  can  be  no  greater  than  eight  characters  long. 
Again, conversion of fields of greater length can easily be programmed.

The  purpose  of  the  word  mode  is  to  provide  the  advantages 
of  high-speed  parallel  operations,  floating-point  abilities  and  the 
inherent information density possible in a binary machine. In the
first case, it is economically feasible to provide parallel operations
in a word machine; the cost of parallel operations on variable
length fields would be prohibitive. In the last case, a given size
memory can contain over twenty percent more numeric information
if that information is expressed in binary rather than binary-coded
decimal, and over eighty percent more information than
can be expressed in six-bit alphanumeric representation.
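
The two percentages can be reproduced from the information content of a decimal digit. This is a back-of-the-envelope sketch, assuming that pure binary needs log2(10) bits per decimal digit, binary-coded decimal uses 4 bits per digit, and the alphanumeric code uses 6 bits per digit; the variable names are illustrative.

```python
import math

# Bits needed per decimal digit in each representation.
bits_binary = math.log2(10)    # about 3.32 bits carry one decimal digit
bits_bcd = 4                   # binary-coded decimal
bits_alphanumeric = 6          # six-bit character code

# Extra numeric information a fixed-size memory holds when using pure binary.
gain_over_bcd = bits_bcd / bits_binary - 1
gain_over_alphanumeric = bits_alphanumeric / bits_binary - 1

print(f"binary vs BCD:          {gain_over_bcd:.1%}")
print(f"binary vs alphanumeric: {gain_over_alphanumeric:.1%}")
```

Both figures come out slightly above the twenty and eighty percent quoted in the text.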

The purpose of the character mode is to provide editing, scanning,
comparison and data manipulative abilities (although addition
and subtraction are also provided). The type of editing facilities
provided obviate the need for the artificial "add-shift-extract-store"
type of editing. For example, operations are provided for
generalized insertion of editing symbols (such as blanks, decimal
points, floating dollar signs, etc.) and for the substitution or suppression
of any unwanted characters. For those interested in the
new area of Information Processing Languages, the character mode
is particularly well suited to list structures.

Program  organization 

Programs in the B 5000 are composed of strings of syllables. A
syllable is the basic unit of the program and is twelve bits in
length. The term "syllable" is used rather than instruction to
distinguish it from conventional single-address or multi-address
instructions. Each program word contains four syllables and they
are executed sequentially in a left-to-right order within the program
word, and sequentially by word. Branching is allowed to any
syllable within a word. Before delving into some of the details
of the internal operation of the B 5000 processor, it is necessary
to discuss stacks, Polish notation, and the Program Reference
Table.
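
The packing of four twelve-bit syllables into one program word can be sketched as follows. This is an illustration only, assuming a 48-bit word with the first syllable to be executed in the high-order position; the function name `syllables` is hypothetical.

```python
SYLLABLE_BITS = 12
SYLLABLES_PER_WORD = 4
SYLLABLE_MASK = (1 << SYLLABLE_BITS) - 1


def syllables(word):
    """Split a 48-bit program word into its four syllables, left to right."""
    return [
        (word >> (SYLLABLE_BITS * (SYLLABLES_PER_WORD - 1 - position))) & SYLLABLE_MASK
        for position in range(SYLLABLES_PER_WORD)
    ]


# Twelve bits are four octal digits, so each group below is one syllable.
example_word = 0o0001_0002_0003_0004
print(syllables(example_word))
```

Because a branch may land on any syllable, a branch target in such a model would need both a word address and a position from 0 to 3 within the word.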

The  stack 

The internal organization of single-address computers forces the
wasting of both programming and running time for the storage
and recall of the intermediate results in the sequence of computation.
The data must be placed into the proper registers and
memory cells before the operation can be executed, and their
contents must often be completely rearranged before the next
operation can be performed. Multi-address computers are constructed
to make the execution of a few selected operations more
efficient, but at the expense of building inefficiencies into all the
rest. Automatic programming aids attack this problem indirectly:
they relieve the programmer of the need to laboriously code his

[Figure: the data word divided into eight character positions: First
Character, Second Character, Third Character, Fourth Character, Fifth
Character, Sixth Character, Seventh Character, Eighth Character.]

Fig.  3.  Data  word  —  character  mode. 




way  around  machine  design,  but  they  still  must  provide  object 
coding  to  accomplish  the  storage  and  recall  functions.  In  brief, 
conventionally  designed  computers,  with  or  without  automatic 
programming aids, require the wasteful expenditure of programming
effort, memory capacity, and running time to overcome the
limitations  of  their  internal  organization. 

The  problem  is  attacked  directly  in  the  B  5000  by  incorporation 
of  a  "pushdown"  stack,  which  completely  eliminates  the  need  for 
instructions  (coded  or  compiled)  to  store  or  recall  intermediate 
results.

In a B 5000 processor, the stack is composed of a pair of registers,
the A and B registers, and a memory area. As operands are
picked up by the programs, they are placed in the A register. If
the A register already contains a word of information, that word
is transferred to the B register prior to loading the operand into
the A register. If the B register is also occupied by information,
then the word in B is stored in a memory area defined by an
address register S. Then the word in A can be transferred to B
and the operand brought into the A register. The new word coming
into the stack has pushed down the information previously held
in the registers. As each pushdown occurs, the address in the S
register is automatically increased by one. The information contained
in the registers is the last information entered into the stack;
the stack operates on a "last in-first out" principle. As information
is operated on in the stack, operands are eliminated from the stack
and results of operations are returned to the stack. As information
in the stack is used up by operations being performed, it is possible
to cause "pushups," i.e., a word is brought from the memory area
addressed by the S register, and the address in the S register is
decreased by one.

To  eliminate  unnecessary  pushdowns  and  pushups,  the  A  and 
B  registers  both  have  indicators  used  for  remembering  whether 
the  registers  contain  information  or  are  empty.  When  an  operand 
is  to  be  placed  in  the  stack  and  either  of  the  registers  is  empty, 
no  pushdown  into  memory  occurs.  Also,  when  an  operation  leaves 
one  or  both  of  the  registers  empty,  no  automatic  pushup  occurs. 

Polish  notation 

The  Polish  logician,  J.  Lukasiewicz,  developed  a  notation  which 
allows  the  writing  of  algebraic  or  logical  expressions  which  do  not 
require  grouping  symbols  and  operator  precedence  conventions. 
For example, parentheses are necessary as grouping symbols in
the expression A(B+C) to convey the desired interpretation of the
expression. In the expression A+B/C, the normal interpretation
is A+(B/C), rather than (A+B)/C, because of the convention that
the / operator is of higher precedence than the + operator. The
right-hand Polish notation used in the B 5000 is based on placing
the operators to the right of their operands; A+B becomes AB+
in Polish notation. A+B+C can be written either as AB+C+,
or as ABC++. In the expression ABC++, the first + operator
says to add the operands B and C. The second + operator says
to add A to the sum of B and C. Returning to the first examples
above, A(B+C) can be written as BC+A× or ABC+× in Polish.
The second example is written as BC/A+ or ABC/+. The extension
of Polish notation to handle equations is shown in the following
example:

Conventional notation    Z = A(B-C)/(D+E)

Polish notation          ABC-×DE+/Z=
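
The translation from conventional to right-hand Polish notation is mechanical. The sketch below uses the well-known operator-stack ("shunting-yard") method, which is a standard textbook technique and not something the article describes; it assumes single-letter operands, an explicit `*` for multiplication, and the hypothetical function name `to_polish`. The trailing "Z=" of an assignment is not handled.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def to_polish(infix):
    """Convert an infix expression with single-letter operands to postfix form."""
    output, pending = [], []
    for token in infix.replace(" ", ""):
        if token.isalnum():
            output.append(token)
        elif token in PRECEDENCE:
            # Emit waiting operators of equal or higher precedence first.
            while (pending and pending[-1] != "("
                   and PRECEDENCE[pending[-1]] >= PRECEDENCE[token]):
                output.append(pending.pop())
            pending.append(token)
        elif token == "(":
            pending.append(token)
        elif token == ")":
            while pending[-1] != "(":
                output.append(pending.pop())
            pending.pop()  # discard the matching "("
        else:
            raise ValueError(f"unexpected symbol: {token!r}")
    while pending:
        output.append(pending.pop())
    return "".join(output)


print(to_polish("A*(B-C)/(D+E)"))
```

The method always yields one particular ordering, for example AB+C+ rather than ABC++ for A+B+C; both are valid Polish strings, as the text notes.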

The  stack  in  use 

To illustrate the functioning of the stack, two simple examples
are shown in Figs. 4 and 5. In the examples, the letters P, Q and
R represent syllables in the program that cause the operands P,
Q, and R to be picked up and placed in the stack. The symbols
+ and × represent syllables that cause the add and multiply
operations to occur. The two examples represent different ways
of writing P(Q+R) in Polish notation. The first example in Fig.
4 does not require pushdowns or pushups. The second example,
shown in Fig. 5, requires a pushdown in the execution of the
syllable R, and a pushup in the execution of the syllable ×. The
columns  in  the  table  represent  the  contents  of  the  various  registers 
after  execution  of  the  syllable  listed  in  the  first  column. 
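
The register behaviour traced in Figs. 4 and 5 can be imitated with a small model. This is a sketch under stated assumptions, not a description of the hardware: `None` stands in for the "empty" indicators, results of two-operand operators are left in B with A empty (as the figures show), the letter `x` stands for the multiply syllable, and the class and function names are hypothetical.

```python
import operator

OPERATORS = {"+": operator.add, "x": operator.mul}


class StackModel:
    """Toy model of the A and B registers backed by memory addressed by S."""

    def __init__(self, base=100):
        self.a = None            # None plays the role of the "empty" indicator
        self.b = None
        self.s = base
        self.memory = {}
        self.pushdowns = 0
        self.pushups = 0

    def load(self, value):
        """Place an operand in A; push down only if both registers are full."""
        if self.a is not None:
            if self.b is not None:
                self.s += 1
                self.memory[self.s] = self.b
                self.pushdowns += 1
            self.b = self.a
        self.a = value

    def _pushup(self):
        word = self.memory.pop(self.s)
        self.s -= 1
        self.pushups += 1
        return word

    def apply(self, symbol):
        """Run a two-operand operator, leaving the result in B and A empty."""
        if self.a is None:               # move B up so that A holds the top word
            self.a, self.b = self.b, None
        if self.a is None:
            self.a = self._pushup()
        if self.b is None:
            self.b = self._pushup()
        self.b = OPERATORS[symbol](self.b, self.a)
        self.a = None


def run(program, values):
    """Execute a string of one-character syllables and return the final stack."""
    stack = StackModel()
    for syllable in program:
        if syllable in OPERATORS:
            stack.apply(syllable)
        else:
            stack.load(values[syllable])
    return stack


values = {"P": 4, "Q": 2, "R": 3}
fig4 = run("QR+Px", values)   # ordering of Fig. 4
fig5 = run("PQR+x", values)   # ordering of Fig. 5
print(fig4.b, fig4.pushdowns, fig4.pushups)
print(fig5.b, fig5.pushdowns, fig5.pushups)
```

Both orderings compute the same product, but only the second touches the memory area, which is the point the two figures make.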


Independence  of  addressing 

One  of  the  goals  set  in  the  design  of  the  B  5000  was  to  make  the 
programs  independent  of  the  actual  memory  locations  of  both  the 
program  itself  and  the  data,  in  order  to  provide  really  automatic 

Polish Notation QR+P×

Syllable          Contents of
Executed     Register A   Register B

Q            Q            Empty
R            R            Q
+            Empty        R+Q
P            P            R+Q
×            Empty        P(R+Q)

Fig.  4 




Polish Notation PQR+×

Syllable                      Contents of
Executed      Register A   Register B   Register S   Cell 101

P             P            Empty        100          -
Q             Q            P            100          -
R  Pushdown   Empty        Q            101          P
   Execute    R            Q            101          P
+             Empty        Q+R          101          P
×  Pushup     Q+R          P            100
   Execute    Empty        P(Q+R)       100

Fig.  5 


program  segmentation.  Through  automatic  program  segmentation, 
it  is  possible  to  have  program  size  practically  independent  of  the 
size  of  core  memory.  The  systems  analyst  or  programmer  intending 
to  do  multi-processing  is  then  no  longer  faced  with  the  difficult 
task of planning what jobs are to be run together in order that
system  storage  capacities  are  not  exceeded. 

In  achieving  independence  of  addressing,  a  solution  requiring 
large  contiguous  areas  of  memory  was  not  deemed  satisfactory. 
Each segment of the program and each data area should be completely
relocatable without modification to the program. It is then
possible to load all the segments of a program or programs onto
the drum at load time and call in the segments to any available
space in core memory as needed during run time. If some segment
of a program is overlaid by a subsequent segment of a program,
the segment of the program destroyed in core memory is still
available on the drum to be called in again if needed.

Due to the very high program densities in the B 5000, the
availability of high capacity drum storage on every system and
automatic segmentation, a minimum B 5000 system has the capacity
for a program or programs equivalent to approximately 40,000
to 60,000 single address instructions. Of course, if an installation
normally ran such large programs, the system would very likely
not be a minimum system. However, the installation having an
occasional need to run very large programs is not prevented from
doing  so  by  storage  capacity. 

Processing speed now becomes a function of the size of core
memory. If large programs are run in a system with small core
memory, time will be consumed in recalling program segments
from drum to core. If the core memory is expanded, less time will
be spent in such activity and the program or programs will be
speeded up, and no reprogramming is required.

Program  reference  table 

The  means  of  achieving  independence  of  addressing  in  the  B  5000 
is called a Program Reference Table (PRT). The PRT is a 1,024
word relocatable area in memory used primarily for storing control
words that locate data areas or program segments. There are
also control words for describing input-output operations. These
control words, called descriptors, contain the base address and size
of data areas, program segments and input-output areas. A descriptor
specifying an input-output operation also contains the designation
of the unit to be used and the type of operation to be
performed.  Operands  may  also  be  stored  in  the  PRT,  providing 
direct  access  to  single  values  such  as  indices,  counts,  control  totals, 
etc. 

In the word mode of the B 5000, every item of data is considered
to be either a single value or an element of an array of
data. If it is a single value, it will be obtained directly from the
PRT; if it is an element of an array, it will be obtained by indexing
a descriptor contained in the PRT.

Program segments are described by program descriptors. In
addition to core base address, the program descriptor contains the
location in drum storage of the program segment and an indication
if the program segment is currently in core memory starting at
the address specified in the descriptor. Entry to a program segment
is made via its program descriptor contained in the PRT. If the
program segment is in core memory, entry will be made to the
program segment. However, when entry is attempted to a program
segment whose descriptor indicates that the segment is not in core
memory, automatic entry to the Master Control Program will occur
and the desired segment will then be brought in from the drum.
Notice that in moving from one segment to another, it is not
necessary to know whether the segment to be entered is currently
in core memory. Branching within a program segment is self-relative,
i.e., the distance to jump either forward or backward is
specified, not the address to be jumped to.

As a result of keeping all actual addresses of data and program
in the PRT, the program itself does not contain any addresses,
but only references to the PRT. To specify one of the 1,024 positions
in the PRT requires only 10 bits, which contributes greatly
to the high program density achieved in the B 5000. Since the
PRT is relocatable, references to the PRT contained in the program
are to relative locations, thus completely freeing the program
from any dependence whatsoever on actual memory locations.
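
Segment entry through a program descriptor can be sketched as follows. This is a simplified illustration, not the Master Control Program's actual logic: the field names, the `enter_segment` function and the `load_from_drum` callback (which stands in for the MCP choosing a free core area and copying the segment) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProgramDescriptor:
    core_base: int       # meaningful only while in_core is True
    drum_address: int    # where the segment always remains available
    in_core: bool        # the "currently in core" indication


def enter_segment(prt, index, load_from_drum):
    """Return the core address at which to start executing segment `index`.

    The caller names the segment only by its PRT position; whether the
    segment is already in core is discovered here, not by the program.
    """
    descriptor = prt[index]
    if not descriptor.in_core:
        # Stand-in for the automatic entry to the Master Control Program.
        descriptor.core_base = load_from_drum(descriptor.drum_address)
        descriptor.in_core = True
    return descriptor.core_base


loads = []


def fake_mcp_loader(drum_address):
    """Pretend to copy a segment into core and report where it landed."""
    loads.append(drum_address)
    return 400


prt = {5: ProgramDescriptor(core_base=0, drum_address=7000, in_core=False)}
first_entry = enter_segment(prt, 5, fake_mcp_loader)
second_entry = enter_segment(prt, 5, fake_mcp_loader)
```

Only the first entry reaches the drum; the second finds the descriptor marked as present and proceeds directly.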




Word  mode  program 

The word mode of the B 5000 processor has four types of syllables.
The  syllable  type  is  distinguished  by  the  two  high-order  bits  of 
each  12-bit  syllable.  The  types  of  syllable  and  the  identification 
bits  are: 

00—  Operator  Syllable 

01—  Literal  Syllable 

10—  Operand  Call  Syllable 

11 —  Descriptor  Call  Syllable 

The  first  of  these,  the  operator  syllable,  causes  operations  to  be 
performed.  The  remaining  ten  bits  of  the  operator  syllable  are  the 
operation  codes.  There  are  approximately  sixty  different  operations 
in  the  word  mode.  For  those  operations  requiring  an  operand  or 
operands, the processor checks for sufficient operands in the registers;
if they are not there, pushups from the stack in memory occur
automatically. 

The  literal  syllable  is  used  for  placing  constants  in  the  stack 
to be used as operands. The ten bits of the literal syllable are
transferred to the stack. This allows the program to contain integers
less than 1,024 as constants.
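
Decoding a word-mode syllable by its two high-order bits can be sketched as below. The split into a 2-bit type and a 10-bit field follows the text; the function name `decode` and the returned labels are illustrative assumptions.

```python
SYLLABLE_TYPES = {
    0b00: "operator",
    0b01: "literal",
    0b10: "operand call",
    0b11: "descriptor call",
}


def decode(syllable):
    """Split a 12-bit syllable into its type and its remaining ten bits."""
    kind = SYLLABLE_TYPES[(syllable >> 10) & 0b11]
    field = syllable & 0x3FF   # operation code, literal value, or PRT position
    return kind, field


print(decode(0b01_0000000101))   # a literal syllable carrying the constant 5
print(decode(0b10_1111111111))   # an operand call on the last PRT position
```

Ten bits give values from 0 to 1,023, which is why literals are limited to integers less than 1,024 and why a call syllable can reach any of the 1,024 PRT positions.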

The operand call syllable and the descriptor call syllable address
locations in the program reference table. The purpose of the
operand  call  syllable  is  to  place  an  operand  in  the  stack;  the 
purpose  of  the  descriptor  call  syllable  is  to  place  the  address  of 
an  operand,  a  descriptor,  in  the  stack.  There  are  four  situations 
that  arise,  depending  on  the  word  read  from  the  program  reference 
table. 

1  The  word  is  an  operand. 

2  The  word  is  a  descriptor  containing  the  address  of  the 
operand. 

3  The  word  is  a  descriptor  containing  the  base  address  of  the 
data  area  in  which  the  operand  resides. 

4  The word is a program descriptor containing the base address
of a subroutine.

For  (1),  the  operand  call  syllable  has  completed  its  action  by 
placing  an  operand  in  the  stack.  The  descriptor  call  syllable  will 
cause  the  construction  of  a  descriptor  of  the  operand,  replacing 
the  operand  by  the  constructed  descriptor. 

For  (2),  the  operand  call  syllable  then  reads  the  operand  from 
the  cell  addressed.  The  descriptor  call  syllable  has  completed  its 
action. 


For (3), indexing of the descriptor by the item that is now the
second  item  in  the  stack  occurs.  For  an  operand  call  syllable,  the 
operand  is  obtained  from  the  indexed  address;  for  the  descriptor 
call  syllable,  action  is  complete  after  the  indexing. 

In  the  case  of  (4),  subroutine  entry  occurs  to  the  subroutine 
addressed.  A  word  of  the  three  previous  types  may  be  left  in  the 
registers  upon  return  from  the  subroutine,  in  which  instance  the 
actions  described  above  will  take  place,  depending  upon  the  type 
of  syllable  which  initiated  the  subroutine. 

Essentially,  the  four  types  of  action  that  occur  for  an  operand 
call  syllable  are  obtaining  an  operand  directly,  indirectly,  from 
an  array,  or  by  computation.  Sometimes  in  the  use  of  the  call 
syllables,  it  is  not  known  which  type  of  action  will  occur  for  a 
particular syllable when the program is created. This is particularly
true for call syllables in subroutines.
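
The four outcomes of an operand call can be sketched in a few lines. This is a loose model under several assumptions: the three descriptor classes are hypothetical stand-ins for the descriptor formats, the stack is a plain Python list, the index is taken from the top of that list because the descriptor itself is never pushed here, and the bounds check models the storage protection interrupt mentioned earlier as a Python exception.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AddressDescriptor:      # case 2: address of a single operand
    address: int


@dataclass
class ArrayDescriptor:        # case 3: base address and size of a data area
    base: int
    size: int


@dataclass
class ProgramDescriptor:      # case 4: a subroutine that computes the word
    routine: Callable[[], object]


def operand_call(prt, index, memory, stack):
    """Place the operand named by PRT position `index` on the stack."""
    word = prt[index]
    while isinstance(word, ProgramDescriptor):      # case 4: by computation
        word = word.routine()
    if isinstance(word, AddressDescriptor):         # case 2: indirectly
        word = memory[word.address]
    elif isinstance(word, ArrayDescriptor):         # case 3: from an array
        subscript = stack.pop()
        if not 0 <= subscript < word.size:
            raise IndexError("index outside the described data area")
        word = memory[word.base + subscript]
    stack.append(word)                              # case 1: directly


memory = {200: 7, 300: 10, 301: 11, 302: 12}
prt = [5, AddressDescriptor(200), ArrayDescriptor(300, 3),
       ProgramDescriptor(lambda: 99)]
stack = []
operand_call(prt, 0, memory, stack)      # direct
operand_call(prt, 1, memory, stack)      # indirect
stack.append(2)                          # subscript for the array reference
operand_call(prt, 2, memory, stack)      # indexed
operand_call(prt, 3, memory, stack)      # computed by a subroutine
```

The calling code is identical in all four cases, which mirrors the remark that the program need not know in advance which kind of word it will find in the table.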

Programs  in  the  word  mode  consist  of  strings  of  syllables  which 
follow  the  rules  of  Polish  notation.  Variable  length  strings  of  call 
syllables  and  literal  syllables,  which  place  items  of  information 
in  the  stack,  are  followed  by  operator  syllables  which  perform  their 
operations  on  information  in  the  stack. 

The  indexing  features  of  the  B  5000  allow  generalized  indexing 
and  at  the  same  time  provide  complete  storage  protection.  Data 
areas and program segments of different programs may be intermingled,
but a program is prevented from storing outside of its
data areas. The method of indexing allows any of the 1,024 words
of the program reference table to be considered index registers.
Multilevel indexing is provided, i.e., indices of arrays can themselves
be elements of arrays.

The  subroutine  control  provided  in  the  B  5000  allows  nesting 
of subroutines, even recursive nesting (a subroutine is a subroutine
of itself), arbitrarily deep. Dynamic allocation of storage for
parameter lists and temporary working storage simplifies the use
of  subroutines.  Storage  is  automatically  allocated  and  deallocated 
as  required. 

Character  mode  program 

In  the  character  mode  of  the  B  5000  Processor,  there  is  only  one 
type  of  syllable,  called  the  operator  syllable.  Program  segments 
in  the  character  mode  are  constructed  of  strings  of  these  syllables. 
The  character  mode  is  designed  to  provide  editing,  formatting, 
comparison,  and  other  forms  of  data  manipulation.  In  doing  so, 
the processor uses two areas of memory: the source and destination
areas. When a program switches from word mode to character
mode, two descriptors containing the base addresses of these
areas are supplied. The source area or destination area may be
changed at any time during character mode so that the program
may act on several areas.

The  character  mode  operator  syllable  is  split  into  two  6-bit 
parts;  the  last  part  specifies  the  operation  to  be  performed  and 
the  first  part  specifies  the  number  of  times  the  operation  is  to  be 
performed.  Operations  are  provided  for  the  transferring,  deletion, 
comparison,  and  insertion  of  characters  or  bits.  Also,  there  are 
operations  which  allow  the  repetition  of  syllable  strings.  This  is 
quite  useful  for  complex  table  look-up  operations  and  for  editing 
information  which  contains  repeated  patterns. 
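
The two halves of a character-mode syllable can be separated as shown below. The sketch assumes only what the text states, a 6-bit repeat count in the first (high-order) part and a 6-bit operation code in the last part; the function name is hypothetical and no real operation codes are implied.

```python
def decode_character_syllable(syllable):
    """Return (repeat_count, operation_code) for a 12-bit character-mode syllable."""
    repeat_count = (syllable >> 6) & 0o77
    operation_code = syllable & 0o77
    return repeat_count, operation_code


# Each pair of octal digits is one 6-bit part.
print(decode_character_syllable(0o7701))
print(decode_character_syllable(0o0512))
```

A 6-bit count cannot exceed 63, which appears to be the origin of the 63-character limit on fields handled in a single operation.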


Conclusion 

The Burroughs B 5000 system has been designed as an integrated
hardware-software package which offers such benefits as savings
in the memory space required to store equivalent object programs;
multi-processing and parallel processing; and running identical
programs on systems with different size memories and different
system  configurations  with  no  loss  in  individual  system  efficiency. 

References 

LoneW61; BartR61; BockR63; CarlC63; MaheR61


Section  6 


Processors  with  multiprogramming 
ability 

The processors in this section have features which allow multiple
programs to exist in the primary memory at the same time.
The  programs  can  be  executed  alternately  by  a  single  processor 
without  having  to  wait  for  new  programs  to  be  input.  The  cost 
is  only  that  of  changing  the  processor  state,  which  involves  only 
a  few  instructions  at  most  (and  only  one  instruction  on  some 
systems,  such  as  the  CDC  6600).  Since  programs  are  subject 
to numerous unpredictable delays within a single run for interchange
with the external environment (either via Ms or T),
substantial increases in Pc utilization can be achieved by multiprogramming.
If more than a single processor has access to
Mp,  the  system  is  called  a  multiprocessor  system. 

Time-shared  computers  are  generally  multiprogrammed. 
Alternatively,  time-shared  systems  can  be  implemented  by 
swapping  programs,  one  at  a  time,  into  primary  memory  for 
interpretation.  The  Berkeley  Time-Sharing  System  (Chap.  24) 
uses  both  multiprogramming  and  program  swapping.  The 
Burroughs  B  5000  (Chap.  22)  is  an  early  computer  to  have 
multiprogram  capability.  The  idea  of  multiprogramming  is  so 
fundamental  that  it  should  be  among  the  first  concepts  to  be 
understood  by  the  student  of  computing  systems.  A  very  nice 
review  of  memory  mapping  and  storage  allocation  is  presented 
in  the  paper  Dynamic  Storage  Allocation  Systems  [Randell  and 
Kuehner,  1968]. 

Atlas 

The  Atlas  is  one  of  the  most  important  machines  described  in 
this book. The prototype was originally designed and constructed
at Manchester University. The Atlas 1 and Atlas 2 were
produced by Ferranti Corp. (prior to becoming part of I.C.T.¹).
Atlas  1  is  the  most  interesting;  it  incorporates  most  of  the 
features  of  the  Atlas  prototype.  The  Lincoln  Laboratory  TX-2 
[Clark,  1957]  influenced  some  Atlas  features:  multiple  index 
registers  and  interrupt  processing  of  input/output  devices. 
Atlas' detailed internal structure is described in a paper [Sumner
et al., 1962].

¹International Computers and Tabulators, U.K.


Two  original  features,  one-level  storage  and  extracodes,  have 
been copied in many other machines. A one-level store is common
to most new computers which are time-shared or multiprogrammed:
the scheme for memory paging in the SDS 940
is  essentially  that  of  Atlas. 

The  extracodes  feature  allows  ordinary  machine  operation 
codes  to  be  used  to  call  subroutines.  Commonly  used  complex 
instructions  (such  as  sin,  cos,  and  monitor  calls)  can  be  written 
in  a  common  operating  system  accessible  to  all  users.  Initially 
these  subroutines  were  stored  in  a  read-only  memory. 

The  ISP  is  straightforward  and  extremely  nice.  The  extra- 
code  idea  appears  in  the  SDS  900  series  and  was  used  in  the 
SDS  940  system  for  defining  common-user  instructions.  The 
IBM  System/360  SVC  (supervisor  call)  instruction  is  an  adapta- 
tion of  the  extracode. 

Atlas  was  about  the  earliest  computer  to  be  designed  with 
a  software  operating  system  and  the  idea  of  user  machine  in 
mind.  The  operating  system  has  been  nicely  described  [Kilburn 
et  al.,  1961]  and  evaluated  [Morris  et  al.,  1967]. 

In  a  letter  to  the  authors  of  this  book,  F.  H.  Sumner  makes 
the  following  comments  on  Atlas. 

The  initial  ideas  and  the  preliminary  research  on  the  Atlas  computer 
system started in the Department of Computer Science of the
University of Manchester in 1956. The team, under the direction of
Professor  T.  Kilburn,  was  later  supplemented  by  several  members 
of  the  I.C.T.  Computer  Research  Department,  and  the  prototype 
machine  was  working  in  the  department  by  the  Autumn  of  1961. 
The  first  production  model  became  operational  in  January  1963. 
The  significant  features  of  the  system  can  be  summarised  as: 

1  The  provision  of  a  virtual  address  field  greater  than  the  real 
address  space. 

2  The  implementation  of  a  "one-level"  store  using  a  mixture 
of  core  store  and  drum  store. 

3  The  interrupt  system  and  the  method  of  peripheral  control. 

4  The  realisation  at  the  design  stage  that  there  would  be  a 
complex  operating  system  and  the  provision  in  the  hardware 
of  specific  features  to  assist  such  an  operating  system. 




The  method  of  peripheral  control  permitted  the  attachment  of 
a  large  number  of  on-line  peripherals  with  rapid  response  and  entry 
into  the  operating  system  for  a  peripheral  requiring  attention.  This, 
together  with  the  multiprogramming  features,  makes  the  design 
ideal for the attachment of keyboards for the provision of multi-access
operation. In the original design, provision for several such
on-line  typewriters  was  made,  but  at  the  production  stage  it  was 
decided  to  remove  these  as  an  economy  measure.  In  view  of  the 
subsequent  development  of  on-line  operation,  this  was  rather  an 
unfortunate  decision. 

The  Atlas  computer  at  the  University  has  now  been  in  continuous 
operation  for  four  years  and  it  is  expected  to  provide  for  the  major 
part  of  the  University's  computing  needs  until  1971. 

During  the  period  of  its  operation  the  provision  of  extensive 
monitoring  and  logging  information  has  permitted  the  behaviour  of 
the  system  to  be  studied  in  detail.  The  results  of  these  studies  have 
been  extremely  valuable  in  the  design  of  a  successor  to  the  Atlas. 

Design  of  the  B  5000  System 

The Burroughs B 5000 computer is described in Part 3, Sec. 5,
page 257, Chap. 22.

A  user  machine  in  a  time-sharing  system 

The  Berkeley  Time-Sharing  Computer  (Fig.  1)  is  based  on  the 
SDS 930 (Chap. 24). The hardware modifications to the SDS


930,  together  with  the  operating  system  software,  were  sold  by 
Scientific  Data  Systems  as  the  SDS  940.  The  operating  system 
and  hardware  modifications  for  multiprogramming  make  the 
940 one of the first commercially available combined hardware-software
time-sharing computers.¹

The  description  in  Chap.  24  is  concerned  with  the  machine 
as  it  appears  to  the  user.  That  is,  the  hardware  and  the  oper- 
ating system  software  are  both  presented  in  the  context  in 
which  they  contribute  to  form  a  user  machine. 

The  940  uses  a  memory  map  which  is  almost  a  subset  of 
that  of  Atlas  but  is  more  modest  than  that  of  the  IBM  360/67 
[Arden et al., 1966] and GE 645 [Dennis, 1965; Daley and
Dennis, 1968]. A number of instructions are apparently built
in via the programmed operator calling mechanism, based on
Atlas  extracodes  (Chap.  23).  The  software-defined  instructions 
emphasize  the  need  for  hardware  features.  For  example,  float- 
ing-point arithmetic  is  needed  when  several  computer-bound 
programs  are  run.  The  SDS  945  is  a  successor  to  the  940,  with 
slightly  increased  capability  but  at  a  lower  cost. 


¹Time-shared computers consist of both hardware and a complex software operating
system. Adams Computer Characteristics Quarterly lists the deliveries of
general-purpose time-shared computers as DEC PDP-6 hardware, October, 1964
(software in early 1965); SDS 940 hardware (and Berkeley software), April, 1966;
GE 635, 645 hardware, May, 1965 (M.I.T.'s project MULTICS software, around
1969); IBM System/360 Model 67 hardware, March, 1966 (software, around
1968).


[PMS diagram; the drawing did not survive extraction. Legible components:
Mp(#0:3) (core; 1.75 μs/w; 16384 w; (24, 1 parity) b/w); M(content
addressable; flip-flop); K('Map); Pc('Modified SDS 930) (chapter reference
partly illegible); Pio; Ms(magnetic tape); T(paper tape); T(Teletype);
Ms(drum; 2 μs/w); Ms(moving head disk); K-C('PDP-5) with T(CRT; display);
T(keyboard; CRT; display). The capacity figures for the drum and the disk
are illegible in the source.]


Fig.  1.  University  of  California  (Berkeley)  time-shared-computer  PMS  diagram. 


Chapter  23 

One-level storage system¹


T. Kilburn / D. B. G. Edwards / M. J. Lanigan
F.  H.  Sumner 


Summary  After  a  brief  survey  of  the  basic  Atlas  machine,  the  paper 
describes  an  automatic  system  which  in  principle  can  be  applied  to  any 
combination  of  two  storage  systems  so  that  the  combination  can  be  regarded 
by  the  machine  user  as  a  single  level.  The  actual  system  described  relates 
to  a  fast  core  store-drum  combination.  The  effect  of  the  system  on  instruc- 
tion times  is  illustrated,  and  the  tape  transfer  system  is  also  introduced 
since  it  fits  basically  in  through  the  same  hardware.  The  scheme  incor- 
porates a  "learning"  program,  a  technique  which  can  be  of  greater  impor- 
tance in  future  computers. 

1.  Introduction 

In a universal high-speed digital computer it is necessary to have
a large-capacity fast-access main store. While more efficient operation
of the computer can be achieved by making this store all
of one type, this step is scarcely practical for the storage capacities
now being considered. For example, on Atlas it is possible to
address 10^6 words in the main store. In practice on the first installation
at Manchester University a total of 10^5 words are provided,
but though it is just technically feasible to make this in one level
it is much more economical to provide a core store (16,000 words)
and drum (96,000 words) combination.

Atlas  is  a  machine  which  operates  its  peripheral  equipment  on 
a  time  division  basis,  the  equipment  "interrupting"  the  normal 
main  program  when  it  requires  attention.  Organization  of  the 
peripheral  equipment  is  also  done  by  program  so  that  many  pro- 
grams can  be  contained  in  the  store  of  the  machine  at  the  same 
time.  This  technique  can  also  be  extended  to  include  several  main 
programs  as  well  as  the  smaller  subroutines  used  for  controlling 
peripherals.  For  these  reasons  as  well  as  the  fact  that  some  orders 
take  a  variable  time  depending  on  the  exact  numbers  involved, 
it  is  not  really  feasible  to  "optimum"  program  transfers  of  infor- 
mation between  the  two  levels  of  store,  i.e.,  core  store  and  drum, 
in  order  to  eliminate  the  long  drum  access  time  of  6  msec.  Hence 
a  system  has  been  devised  to  make  the  core  drum  store  combi- 
nation appear  to  the  programmer  as  a  single  level  of  storage,  the 

¹IRE Trans., EC-11, vol. 2, pp. 223-235, April, 1962.


requisite  transfers  of  information  taking  place  automatically.  There 
are  a  number  of  additional  benefits  derived  from  the  scheme 
adopted,  which  include  relative  addressing  so  that  routines  can 
operate  anywhere  in  the  store,  and  a  "lock  out"  facility  to  prevent 
interference  between  different  programs  simultaneously  held  in 
the  store. 

2.    The  basic  machine 

The  arrangement  of  the  basic  machine  is  shown  in  Fig.  1.  The 
available  storage  space  is  split  into  three  sections;  the  private  store 
which  is  used  solely  for  internal  machine  organization,  the  central 
store  which  includes  both  core  and  drum  store,  in  which  all  words 
are  addressed  and  is  the  store  available  to  the  normal  user,  and 
finally  the  tape  store,  which  is  the  conventional  backing-up  large 
capacity  store  of  the  machine.  Both  the  private  store  and  the  main 
core  store  are  linked  with  the  main  accumulator,  the  B-store,  and 
the  B-arithmetic  unit.  However  the  drum  and  tape  stores  only  have 
access  to  these  latter  sections  of  the  machine  via  the  main  core 
store. 

The  machine  order  code  is  of  the  single  address  type,  and  a 
comprehensive range of basic functions are provided by normal
engineering  methods.  Also  available  to  the  programmer  are  a 
number  of  extra  functions  termed  "extracodes"  which  give  auto- 
matic access  to  and  subsequent  return  from  a  large  number  of 
built-in  subroutines.  These  routines  provide 

1  A  number  of  orders  which  would  be  expensive  to  provide 
in  the  machine  both  in  terms  of  equipment  and  also  time 
because  of  the  extra  loading  on  certain  circuits.  An  example 
of  this  is  the  order; 

Shift  accumulator  contents  ±n  places  where  n  is  an  integer. 

2 The more complex mathematical operations, e.g., sin x,
log x, etc.,

3  Control  orders  for  peripheral  equipments,  card  readers, 
parallel  printers,  etc., 

4  Input-output  conversion  routines. 




[Diagram not recoverable from the source. Legible labels: fixed store
(2 meshes, 2 × 4,096 words); subsidiary store; decode on digits 23, 22, 21;
main core store (4 stacks, 4,096 words each); peripheral equipments;
address channels; information channels (two way).]

Fig.  1.  Layout  of  basic  machine. 

5  Special  programs  concerned  with  storage  allocation  to 
different  programs  being  run  simultaneously,  monitoring 
routines  for  fault  finding  and  costing  purposes,  and  the 
detailed organization of drum and tape transfers.

All this information is permanently required and hence is kept
in part of the private store termed the "fixed store" [Kilburn and
Grimsdale, 1960a] which operates on a "read only" basis. This store
consists of a woven wire mesh into which a pattern of small
"linear" ferrite slugs are inserted to represent digital information.
The information content can only be changed manually and will
tend to differ only in detail between the different versions of the
Atlas computer. In Muse this store is arranged in two units each
of 4096 words, a unit consisting of 16 columns of 256 words, each
word being 50 bits. The access time to a word in any one column
is about 0.4 μsec. If a change of column address is required, this
figure increases by about 1 μsec due to switching transients in the
read amplifiers. Subsequent accesses in the new column revert to
0.4 μsec. The store operates in conjunction with a subsidiary core
store of 1024 words which provides working space for the fixed
store programs, and has a cycle time of about 1.8 μsec. There are
certain  safeguards  against  a  normal  machine  user  gaining  access 
to  addresses  in  either  part  of  the  private  store,  though  in  effect 
he  makes  use  of  this  store  through  the  extracode  facility. 
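
The fixed store timing quoted above can be turned into a rough cost model. The following is only a sketch: the function name is hypothetical, and it assumes that the first access of a sequence pays no column-switching penalty, which the paper does not state.

```python
# Rough timing model of the fixed store; all times in microseconds.
SAME_COLUMN_US = 0.4    # access within the current column (figure from the text)
COLUMN_SWITCH_US = 1.0  # extra delay when the column address changes (from the text)

def fixed_store_time(columns):
    """Total access time for a sequence of column numbers (0..15).

    Assumption: the very first access is charged the plain 0.4 usec.
    """
    total = 0.0
    previous = None
    for col in columns:
        total += SAME_COLUMN_US
        if previous is not None and col != previous:
            total += COLUMN_SWITCH_US  # switching transients in the read amplifiers
        previous = col
    return total
```

Under these assumptions a routine that stays inside one column costs 0.4 μsec per access, while one that alternates between two columns costs about 1.4 μsec per access.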

The central store of the machine consists of a drum and core
store combination, which has a maximum addressable capacity of
about 10^6 words. In Muse the central store capacity is about 96,000
words contained on 4 drums. Any part of this store can be transferred
in blocks of 512 words to/from the main core store, which
consists of four separate stacks, each stack having a capacity of
4096 words.

The  tape  system  provides  a  very  large  capacity  backing  store 
for  the  machine.  The  user  can  effect  transfers  of  variable  amounts 
of  information  between  this  store  and  the  central  store.  In  actual 
fact  such  transfers  are  organized  by  a  fixed  store  program  which 
initiates  automatic  transfers  of  blocks  of  512  words  between  the 
tape  store  and  the  main  core  store.  The  system  can  handle  eight 
tape decks running simultaneously, each producing or demanding
a word on average every 88 μsec.

The main core store address can thus be provided from either
the central machine, the drum, or the tape system. Since there
is no synchronization between these addresses, there has to be a
priority system to allocate addresses to the core store. The drum
has top priority since it delivers a word every 4 μsec, the tape
next priority since words can arise every 11 μsec from 8 decks,
and the machine uses the core store for the rest of the available
time. A priority system necessarily takes time to establish its
priority, and so it has been arranged that it comes into effect only
at each drum or tape request. Thus the machine is not slowed
down in any way when no drum or tape transfers take place. The
effect of drum and tape transfers on machine speed is given in
Appendix 1.
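
The priority rule just described (drum first, then tape, then the central machine) can be sketched as a small arbiter. The function names are hypothetical and the code models only the ordering, not the hardware timing. The second function reproduces the arithmetic behind the tape figure: eight decks at one word per 88 μsec each give one word every 11 μsec on average.

```python
# Priority order taken from the text: drum, then tape, then the central machine.
PRIORITY = ("drum", "tape", "machine")

def grant(requests):
    """Return the source that gets the next core store cycle, or None if idle."""
    for source in PRIORITY:
        if source in requests:
            return source
    return None

def tape_word_interval_us(decks=8, per_deck_us=88):
    """Average interval between tape words when all decks run simultaneously."""
    return per_deck_us / decks
```

For example, `grant({"machine", "tape"})` yields `"tape"`, and the central machine is served only when neither the drum nor the tape has a request pending.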

To simplify the control commands given to the drum, tape, and
peripheral equipment in the machine, the orders all take the form
b → S or s → B and the identification of the required command
register is provided by the address S. This type of storage is clearly
widely scattered in the machine but is termed collectively the
V-store.

In the central machine the main accumulator contains a fast
adder [Kilburn et al., 1960b] and has built-in multiplication and
division facilities. It can deal with fixed or floating point numbers
and its operation is completely independent of the B-store and
B-arithmetic unit. The B-store is a fast core store (cycle time 0.7
μsec) of 120 twenty-four bit words operating in a word selected
partial flux switching mode [Edwards et al., 1960]. Eight "fast"
B  lines  are  also  provided  in  the  form  of  flip-flop  registers.  Of  these, 
three  are  used  as  control  lines,  termed  main,  extracode,  and  inter- 
rupt controls  respectively.  The  arrangement  has  the  advantage 
that  the  control  numbers  can  be  manipulated  by  the  normal  B-type 
orders,  and  the  existence  of  three  controls  permits  the  machine 
to  switch  rapidly  from  one  to  another  without  having  to  transfer 
control  numbers  to  the  core  store.  Main  control  is  used  when  the 




[Fig. 2 drawing not recoverable from the source. Legible content:
(a) instruction fields: Function, 10 bits; Ba, 7 bits; Bm, 7 bits;
Address, 24 bits. (b) address digits 23 to 0, showing the block address
and line address for the central store (core store and drum) and the
address layouts for the fixed store, subsidiary store, and V-store, with
the low-order digits selecting the half-word and character. (c) function
digits 47 to 38, decoded into B codes, B test codes, A codes, extracode
return, B-type extracodes, and A-type extracodes. (d) floating-point
number: exponent Y, 8 bits including sign; mantissa X, 40 bits including
sign.]


central  machine  is  obeying  the  current  program,  while  the  extra- 
code  control  is  concerned  with  the  fixed  store  subroutines.  The 
interrupt control provides the means for handling numerous peripheral
equipments which "interrupt" the machine when they
either  require  or  are  providing  information.  The  remaining  "fast" 
B  lines  are  mainly  used  for  organizational  procedures,  though  B124 
is  the  floating  point  accumulator  exponent. 

The operating speed of the machine is of the order of 0.5 × 10^6
instructions per second. This is achieved by the use of fast transistor
logic circuitry, rapid access to storage locations, and an
extensive  overlapping  technique.  The  latter  procedure  is  made 
possible  by  the  provision  of  a  number  of  intermediate  buffer  stor- 
age registers,  separate  access  mechanisms  to  the  individual  units 
of  core  store  and  parallel  operation  of  the  main  accumulator  and 
B-arithmetic  units.  The  word  length  throughout  the  machine  is 
48  bits  which  may  be  considered  as  two  half-words  of  24  bits  each. 
All  store  transfers  between  the  central  machine,  the  drum  and  tape 
stores  are  parity  checked,  there  being  a  parity  digit  associated  with 
each  half-word.  In  the  case  of  transfers  within  the  central  store 
(i.e., between main core store and drum) the parity digits associated
with a given word are retained throughout the system. Tape
transfers  are  parity  checked  when  information  is  transferred  to 
and  from  the  main  core  store,  and  on  the  tape  itself  a  check  sum 
technique  involving  the  use  of  two  closely  spaced  heads  is  used. 

The  form  of  the  instruction,  which  allows  for  two  B-modifica- 
tions,  and  the  allocation  of  the  address  digits  is  shown  in  Fig.  2a. 
Half  of  the  addressable  store  locations  are  allocated  to  the  central 
store  which  is  identified  by  a  zero  in  the  most  significant  digit 
of the address. (See Fig. 2b.) This address can be further subdivided
into  block  address,  and  line  address  in  a  block  of  512  words.  The 
least  significant  digits,  0  and  1,  make  it  possible  to  address  6  bit 
characters  in  a  half  word  and  digit  2  specifies  the  half  word. 
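
The field layout described above can be illustrated with a short decoding sketch. The function names are hypothetical. The field widths (function 10 bits, Ba 7 bits, Bm 7 bits, address 24 bits) are those of Fig. 2a; the exact bit positions of the block address (digits 22 to 12) and line address (digits 11 to 3) are an inference from the 512-word block size and the use of digits 2, 1, and 0 for the half-word and character, not a quotation from the paper.

```python
def decode_instruction(word):
    """Split a 48-bit instruction word into its four fields."""
    return {
        "function": (word >> 38) & 0x3FF,  # 10 bits, digits 47..38
        "ba": (word >> 31) & 0x7F,         # 7 bits, selects one of 128 B lines
        "bm": (word >> 24) & 0x7F,         # 7 bits, selects one of 128 B lines
        "address": word & 0xFFFFFF,        # 24 bits
    }

def decode_central_address(address):
    """Split a 24-bit central store address (most significant digit zero)."""
    if (address >> 23) & 1:
        raise ValueError("most significant digit is 1: not a central store address")
    return {
        "block": (address >> 12) & 0x7FF,    # assumed digits 22..12
        "line": (address >> 3) & 0x1FF,      # assumed digits 11..3, 512 words per block
        "half_word": (address >> 2) & 1,     # digit 2
        "character": address & 0x3,          # digits 1..0, four 6-bit characters
    }
```

With this layout the central store spans 2^11 blocks of 512 words, about 10^6 words, which agrees with the addressable capacity quoted earlier.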

The  function  number  is  split  into  several  sections,  each  section 
relating  to  a  particular  set  of  operations,  and  these  are  listed  in 
Fig.  2c.  The  machine  orders  fall  into  two  broad  classes,  and  these 
are 

1 B codes: These involve operations between a B line specified
by the Ba digits in the instruction and a core store line
whose address can be modified by the contents of a B line
determined by the Bm digits. There are a total of 128 B
lines, one of which, B0, always contains zero. Of the other
lines 90 are available to the machine user, 7 are special
registers previously mentioned, and a further 30 are used
by extracode orders.

2  A  codes:  These  involve  operations  between  the  Accumulator 
and  a  core  store  line  whose  address  can  now  be  doubly 


Fig.  2.  Interpretation  of  a  word,  (a)  Form  of  instruction,  (b)  Allocation 
of  address  digits,  (c)  Function  of  decoding,  (d)  Floating-point  number 
X8^Y.


modified first by contents of Bm, and then by the contents
of Ba. Both fixed and floating point orders are provided, and
in the latter case numbers take the form of X8^Y, the digit
allocation of X and Y being shown in Fig. 2d. When fixed
point working occurs, use is made only of the X digits.




3.    One-level  store  concept 

The  choice  of  system  for  the  fast  access  store  in  a  large  scale 
computer is governed by a number of conflicting factors which
include  speed  and  size  requirements,  economic  and  technical 
difficulties.  Previously  the  problem  has  been  resolved  in  two  ex- 
treme cases  either  by  the  provision  of  a  very  large  core  store,  e.g., 
the  2.5  megabit  [Papian,  1957]  store  at  M.I.T.,  or  by  the  use  of 
a small core store (40,000 bits) expanded to 640,000 bits by a drum
store as in the Ferranti Mercury [Lonsdale and Warburton, 1956;
Kilburn  et  al.,  1956]  computer.  Each  of  these  methods  has  its 
disadvantages,  in  the  first  case,  that  of  expense,  and  in  the  second 
case,  that  of  inconvenience  to  the  user,  who  is  obliged  to  program 
transfers  of  information  between  the  two  types  of  store  and  this 
can  be  time  consuming.  In  some  instances  it  is  possible  for  an 
expert  machine  user  to  arrange  his  program  so  that  the  amount 
of  time  lost  by  the  transfers  in  the  two-level  storage  arrangement 
is  not  significant,  but  this  sort  of  "optimum"  programming  is  not 
very desirable. Suitable interpretative coding [Brooker, 1960] can
permit the two-level system to appear as one level. The effect is,
however, accompanied by an effective loss of machine speed
which, in some programs and depending on details of machine
design, can be quite severe, varying typically, for example, between
one and three.

The  two-level  storage  scheme  has  obvious  economic  advan- 
tages, and  inconvenience  to  the  machine  user  can  be  eliminated 
by making the transfer arrangements completely automatic. In
Atlas a completely automatic system has been provided with techniques
for minimizing the transfer times. In this way the core
and  drum  are  merged  into  an  apparent  single  level  of  storage  with 
good  performance  and  at  moderate  cost.  Some  details  of  this  ar- 
rangement on  the  Muse  are  now  provided. 

The  central  store  is  subdivided  into  blocks  of  512  words  as 
shown by the address arrangements in Fig. 2b. The main core store
is also partitioned into blocks of this size which for identification
purposes are called pages. Associated with each of these core store
page  positions  is  a  "page  address  register"  (P.A.R.)  which  contains 
the  address  of  the  block  of  information  at  present  occupying  that 
page  position.  When  access  to  any  word  in  the  central  store  is 
required  the  digits  of  the  demanded  block  address  are  compared 
with  the  contents  of  all  the  page  address  registers.  If  an  "equiva- 
lence" indication  is  obtained  then  access  to  that  particular  page 
position is permitted. Since a block can occupy any one of the
32  page  positions  in  the  core  store  it  is  necessary  to  modify  some 
digits  of  the  demanded  block  address  to  conform  with  the  page 
positions  in  which  an  equivalence  was  obtained. 
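
The equivalence test described in this paragraph amounts to an associative lookup over the 32 page address registers. A minimal sketch follows, with hypothetical names; the hardware compares all registers simultaneously, whereas this loop examines them one after another, and the way the core address is formed from the page position is a simplification.

```python
PAGE_COUNT = 32        # page positions in the main core store (from the text)
WORDS_PER_BLOCK = 512  # words per block (from the text)

class CoreStore:
    def __init__(self):
        # par[p] holds the block number currently occupying page position p
        self.par = [None] * PAGE_COUNT

    def lookup(self, block, line):
        """Return the core word address, or None on "not equivalence"."""
        for page, held in enumerate(self.par):
            if held == block:
                # The demanded block address digits are replaced by the page position.
                return page * WORDS_PER_BLOCK + line
        return None
```

A `None` result corresponds to the "not equivalence" indication, which the paper handles by an interrupt into the drum transfer routine.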


These  processes  are  necessarily  time  consuming  but  by  provid- 
ing a  by-pass  of  this  procedure  for  instruction  accesses  (since,  in 
general, instruction loops are all contained in the same block) then
most of this time can be overlapped with a useful portion of the
machine or core store rhythm. In this way information in the core
store is available to the machine at the full speed of the core store
and only rarely is the over-all machine speed affected by delays
in  the  equivalence  circuitry. 

If  a  "not  equivalence"  indication  is  obtained  when  the  de- 
manded block  address  is  compared  with  the  contents  of  the 
P.A.R.'s then that address, which may have been B-modified, is
first stored in a register which can be accessed as a line of the
V-store. This permits the central machine easy access to this address.
An "interrupt" also occurs which switches operation of the
machine  over  to  the  interrupt  control,  which  first  determines  the 
cause  of  the  interrupt  and  then,  in  this  instance,  enters  a  fixed 
store  routine  to  organize  the  necessary  transfers  of  information 
between drum and core store.

A.  Drum  transfers 

On each drum, one track is used to identify absolute block positions
around the drum periphery. The records on these tracks are
read into the θ registers which can be accessed as lines of the
V-store and this permits the present angular drum position to be
determined, though only in units of one block. In this way the
time needed to transfer any block while reading from the drums
can be assessed. This time varies between 2 and 14 msec since
the drum revolution time is 12 msec and the actual transfer time
2 msec.
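
The 2 to 14 msec range follows directly from the two figures given: a wait of anywhere from zero to one full revolution, plus the transfer itself. A small sketch of that arithmetic, with a hypothetical function name and with angular positions expressed in milliseconds of rotation for convenience:

```python
REVOLUTION_MS = 12.0  # drum revolution time (from the text)
TRANSFER_MS = 2.0     # time to transfer one block (from the text)

def read_time_ms(now_ms, block_start_ms):
    """Time until a block has been fully read.

    Both arguments are angular positions measured in msec of rotation;
    the wait is the rotation still needed to reach the start of the block.
    """
    wait = (block_start_ms - now_ms) % REVOLUTION_MS
    return wait + TRANSFER_MS
```

If the block is just arriving under the heads the result is 2 msec; if it has just been missed the wait approaches a full revolution and the result approaches 14 msec.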

The time of a writing transfer to the drums has been reduced
by writing the block of information to the first available empty
block position on any drum. Thus the access time of the drum
can be eliminated provided there are a reasonable number of
empty blocks on the drum. This means, however, that transfers
to/from the drum have to be carried out by reference to a directory
and this is stored in the subsidiary store and up-dated whenever
a transfer occurs.
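
The write policy and the directory can be sketched as follows. This is an illustration under several assumptions that are not in the paper: six block positions per revolution (12 msec divided by 2 msec), a single band per drum, all drums at the same angular position, and a block that is not already recorded on a drum. The class and method names are hypothetical.

```python
SECTORS = 6  # assumed block positions per revolution (12 msec / 2 msec)

class Drums:
    def __init__(self, drum_count=4):
        # contents[d][s] is the block number at sector s of drum d, or None if empty
        self.contents = [[None] * SECTORS for _ in range(drum_count)]
        self.directory = {}  # block number -> (drum, sector)

    def write(self, block, position):
        """Write `block` to the first empty sector reached from `position`.

        Returns the number of sectors that pass before the write begins.
        """
        for step in range(SECTORS):
            sector = (position + step) % SECTORS
            for drum, slots in enumerate(self.contents):
                if slots[sector] is None:
                    slots[sector] = block
                    self.directory[block] = (drum, sector)  # up-dated on every transfer
                    return step
        raise RuntimeError("no empty block position on any drum")
```

As long as some drum has an empty position at the sector now arriving, the returned wait is zero, which is the sense in which the access time of the drum is eliminated for writing.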

When the drum transfer routine is entered the first action is
to determine the absolute position on a drum of the required block.
The order is then given to carry out the transfer to an empty page
position in the core store. The transfer occurs automatically as
soon as the drum reaches the correct angular position. The page
address register in the vacant position in the core store is set to
a specific block number for drum transfers. This technique simplifies
the engineering with regard to the provision of this number




from the drum and also provides a safeguard against transferring
to  the  wrong  block. 

As  soon  as  the  order  asking  for  a  read  transfer  from  the  drum 
has  been  given  the  machine  continues  with  the  drum  transfer 
program.  It  is  now  concerned  with  determining  a  block  to  be 
transferred back from the core store to the drum. This is necessary
to ensure an empty core store page position when the next read
transfer is required. The block in the core store to be transferred
has to be carefully chosen to minimize the number of transfers
in  the  program  and  this  optimization  process  is  carried  out  by  a 
learning  program,  details  of  which  are  given  in  Sec.  5.  The  opera- 
tion of  this  program  is  assisted  by  the  provision  of  the  "use"  digits 
which  are  associated  with  each  page  position  of  the  core  store. 

To  interchange  information  between  the  core  store  and  drums, 
two transfers, a read from and a write to the drum are necessary.
These  have  to  be  done  sequentially  but  could  occur  in  either  order. 
The  technique  of  having  a  vacant  page  position  in  the  core  store 
permits  a  read  transfer  to  occur  first  and  thus  allows  the  time  for 
the  learning  program  to  be  overlapped  either  into  the  waiting 
period  for  the  read  transfer  or  into  the  transfer  time  itself.  In  the 
time  remaining  after  completion  of  the  learning  program  an  entry 
is  made  into  the  over-all  supervisor  program  for  the  machine,  and 
a  decision  is  taken  concerning  what  the  machine  is  to  do  until 
the  drum  transfer  is  completed.  This  might  involve  a  change  to 
a  different  main  program. 
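
The ordering described above (read into the vacant page first, then choose and write back another block so that a vacant page exists for the next fault) can be sketched as below. The class is hypothetical, and the victim choice is a deliberate placeholder: the paper's learning program (Sec. 5) is not reproduced here.

```python
class OneLevelStore:
    def __init__(self, pages=32):
        self.par = [None] * pages  # block held in each page position
        self.vacant = pages - 1    # one page position is always kept empty
        self.log = []              # order of drum transfers, for illustration

    def choose_victim(self, exclude):
        # Placeholder policy: the lowest-numbered page other than `exclude`.
        # This is NOT the learning program of the paper.
        return next(p for p in range(len(self.par)) if p != exclude)

    def page_fault(self, block):
        """Bring `block` into core and re-establish a vacant page position."""
        page = self.vacant
        self.log.append(("read", block, page))  # the read transfer is ordered first
        self.par[page] = block
        empties = [p for p, b in enumerate(self.par) if b is None]
        if empties:
            self.vacant = empties[0]            # nothing needs writing back yet
        else:
            victim = self.choose_victim(exclude=page)
            self.log.append(("write", self.par[victim], victim))
            self.par[victim] = None
            self.vacant = victim
        return page
```

Because the read is issued before the victim is chosen, the time spent choosing can overlap the wait for the drum, which is the point made in the paragraph above.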

A  program  could  ask  for  access  to  information  in  a  page  position 
while  a  drum  or  tape  transfer  is  taking  place  to  that  page.  This 
is  prevented  in  Atlas  by  the  use  of  a  "lock  out"  (L.O.)  digit  which 
is  provided  with  each  Page  Address  Register.  When  a  lock  out 
digit is set at 1, access to that page is only permitted when the
address has been provided either by the drum system, the tape
system,  or  the  interrupt  control.  The  latter  case  permits  all  trans- 
fers from  paper  tape,  punched  card,  and  other  peripheral  equip- 
ments, to  be  handled  without  interference  from  the  main  program. 
When  the  transfer  of  a  block  has  been  completed  the  organizing 
program  resets  the  L.O.  digit  to  zero  and  access  to  that  page 


position  can  then  be  made  from  the  central  machine.  It  is  clear 
that  the  L.O.  digit  can  also  be  used  to  prevent  interference  be- 
tween programs  when  several  different  ones  are  being  held  in  the 
machine  at  the  same  time. 

In Sec. 3 it was stated that addresses demanding access to the
core store could arise from three distinct sources, the central
machine, the drum, and the tape. These accesses are complicated
because of (1) the equivalence technique, and (2) the lock out digit.
The various cases and the action that takes place are summarized
in Table 1.
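
For an address supplied by the central machine, the three outcomes can be written as a small decision function. This is a sketch with hypothetical names; it covers only central machine requests and does not model addresses supplied by the drum system, the tape system, or the interrupt control.

```python
def central_access(block, par, lock_out):
    """Outcome of a central machine request for `block`.

    par[p] is the block held in page position p (None if empty);
    lock_out[p] is that page's lock out digit (0 or 1).
    """
    for page, held in enumerate(par):
        if held is not None and held == block:
            if lock_out[page]:
                return ("not available", page)  # equivalence, lock out = 1
            return ("access", page)             # equivalence, lock out = 0
    return ("drum transfer", None)              # not equivalence
```

The same check is what keeps one program from reaching pages that belong to another: the supervisor only has to set the lock out digit on every page the running program should not see.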

The  provision  of  the  Page  Address  Registers,  the  equivalence 
circuitry,  and  the  learning  program  have  permitted  the  core  store 
and  drum  to  be  regarded  by  the  ordinary  machine  user  as  a  one- 
level  store,  and  the  system  has  the  additional  feature  of  "floating 
address" operation, i.e., any block of information can be stored
in  any  absolute  position  in  either  core  or  drum  store.  The  minimum 
access  time  to  information  in  this  store  is  obviously  limited  by 
the  core  store  and  its  arrangement  and  this  is  now  discussed. 

B.  Core  store  arrangement 

The  core  store  is  split  into  four  stacks,  each  with  individual  address 
decoding  and  read  and  write  mechanisms.  The  stacks  are  then 
combined  in  such  a  way  that  common  channels  into  the  machine 
for the address, read and write digits are time shared between
the  various  stacks.  Sequential  address  positions  occur  in  two  stacks 
alternately  and  a  page  position  which  contains  a  block  of  512 
sequential  addresses  is  thus  arranged  across  two  stacks.  In  this  way 
it  is  possible  to  read  a  pair  of  instructions  from  consecutive  ad- 
dresses in  parallel  by  increasing  the  size  of  the  read  channel.  This 
permits  two  instructions  to  be  completely  obeyed  in  three  store 
"accesses."  The  choice  of  this  particular  storage  arrangement  is 
discussed  in  Appendix  2. 
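
As a rough illustration of the interleaving just described, the sketch below maps a page position and a line within the page onto a stack and a word position in that stack. The allocation of pages 0 to 15 to stacks 0 and 1 and of pages 16 to 31 to stacks 2 and 3 follows Fig. 4. The rule that even lines fall in the lower-numbered stack of a pair, and the names `locate` and `instruction_pair`, are assumptions made for the example, not details given in the text.

```python
PAGE_SIZE = 512             # words per page position (from the text)
PAGES_PER_STACK_PAIR = 16   # pages 0-15 in stacks 0/1, pages 16-31 in stacks 2/3 (Fig. 4)


def locate(page, line):
    """Return (stack, word_in_stack) for one word of the core store.

    Assumption: even lines go to the lower-numbered stack of the pair
    and odd lines to the higher-numbered one.
    """
    if not (0 <= page < 2 * PAGES_PER_STACK_PAIR and 0 <= line < PAGE_SIZE):
        raise ValueError("page or line out of range")
    pair = page // PAGES_PER_STACK_PAIR        # 0 -> stacks 0/1, 1 -> stacks 2/3
    stack = 2 * pair + (line % 2)              # consecutive lines alternate stacks
    page_in_pair = page % PAGES_PER_STACK_PAIR
    word_in_stack = page_in_pair * (PAGE_SIZE // 2) + line // 2
    return stack, word_in_stack


def instruction_pair(page, even_line):
    """Locations of two consecutive instructions starting at an even line."""
    return locate(page, even_line), locate(page, even_line + 1)
```

Under these assumptions the two instructions of a pair always sit at the same word position in the two stacks of a pair, which is what allows both to be read in one cycle over a widened read channel.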

The  coordination  of  these  four  stacks  is  done  by  the  "core  stack 
coordinator"  and  some  features  of  this  are  now  discussed,  starting 
with  the  operation  of  a  single  stack. 


Table 1    Comparison of demanded block address with contents of the P.A.R.'s: resultant state of equivalence and lock out circuits

                      Equivalence,                      Not equivalence               Equivalence,
                      lock out = 0                                                    lock out = 1
Source of address     [E.Q.]                            [N.E.Q.]                      [E.Q. & L.O.]

1. Central Machine    Access to required page position  Enter drum transfer routine   Not available to this program
2. Drum System        Access to required page position  Fault condition indicated     Fault condition indicated
3. Tape System        Access to required page position  Fault condition indicated     Fault condition indicated

Chapter  23  {  One-level  storage  system  281 


C.  Operation  of  a  single  stack  of  core  store 

The  storage  system  employed  is  a  coincident  current  M.I.T.  system 
arranged  to  give  parallel  read  out  of  50  digits.  The  reading  opera- 
tion is  destructive  and  each  read  phase  of  the  stack  cycle  is  fol- 
lowed by  a  write  phase  during  which  the  information  read  out 
may  be  rewritten.  This  is  achieved  by  a  set  of  digit  staticizors 
which  are  loaded  during  the  read  phase  and  are  used  to  control 
the  inhibit  current  drivers  during  the  write  phase.  When  new 
information  is  to  be  written  into  the  store  a  similar  sequence  is 
followed,  except  that  the  digit  staticizors  are  loaded  with  the  new 
information during the read phase. A diagram indicating the
different types of stack cycle is shown in Fig. 3.


[Figure 3: timing waveforms (stack request, read phase, read strobe, write strobe, write phase) for each type of stack cycle.]

$T_A$ = access time, $T_C$ = cycle time, $W_D$ = wait for address decoding and loading of address register, $W_W$ = wait for release of write hold.

Fig. 3. Basic types of stack cycle. (a) Read order (s → A). (b) Write order (a → s). (c) Read-write order (b + s → S).

There is a small delay (≈100 mμsec) between the "stack
request" signal, $S_R$, and the start of the read phase to allow for
setting of the address state and the address decoding. The output
information from the store appears in the read strobe period, which
is towards the end of the read phase. In general, the write phase
starts as soon as the read phase ends. However, the start of the
write phase may be held up until the new information is available
from the central machine. This delay is shown as $W_W$ in Fig. 3c.
The interval between the stack request and the read strobe
is termed the stack access time, $T_A$, and in practice this is approximately
one third of the cycle time $T_C$. Both $T_A$ and $T_C$ are functions
of the storage system and, assuming that $W_W$ is zero, have typical
values of 0.7 μsec and 1.9 μsec respectively. A holdup gate in the
request channel prevents the next stack request occurring before
the end of the preceding write phase.
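
The effect of the holdup gate on a single stack can be sketched as follows. The access and cycle times are the typical values quoted above. Measuring the cycle time from the moment a request is accepted, and the function name `read_strobe_times`, are assumptions of this sketch rather than statements from the text.

```python
ACCESS_TIME = 0.7   # microseconds, stack request to read strobe (typical value)
CYCLE_TIME = 1.9    # microseconds, full read-plus-write cycle (typical value)


def read_strobe_times(request_times, write_wait=0.0):
    """Return the read-strobe time for each request made to one stack.

    A request is held up until the preceding cycle, including any wait
    for new information before the write phase, has finished.
    """
    strobes = []
    stack_free_at = 0.0
    for requested in sorted(request_times):
        start = max(requested, stack_free_at)        # the holdup gate
        strobes.append(round(start + ACCESS_TIME, 2))
        stack_free_at = start + CYCLE_TIME + write_wait
    return strobes
```

For requests at 0, 1.0 and 5.0 μsec, the second request is delayed until the first cycle ends at 1.9 μsec, while the third finds the stack free and proceeds at once.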

D.  Operation  of  the  main  core  store  with  the  central  machine 

A schematic diagram of the essentials of the main core store control
system is shown in Fig. 4. The control signals $SA_1$ and $SA_2$
indicate whether the address presented is that of a single word
or a pair of sequentially addressed instructions. Assuming that the
flip-flop F is in the reset condition, either of these signals results
in the loading of the buffer address register (B.A.R.). This loading
is done by the signal B.A.B.A., which also indicates that the buffer
register in the central machine has become free.

In dealing with the first request the block address digits in the
B.A.R. are compared with the contents of all the page address
registers. Then one of the indications summarized in Table 1 and
indicated in Fig. 4 is obtained. Assuming access to the required
store stack is permitted, a SET CSF signal is given which
resets the flip-flop F. If this occurs before the next access request
arises, then the speed of the system is not store-limited. In most
cases SET CSF is generated when the equivalence operation on
the demanded block address is complete, and the read phase of
the appropriate stack (or stacks) has started. Until this time the
information held in the B.A.R. must not be allowed to change.
In  Fig.  5  a  flow  diagram  is  shown  for  the  various  cases  which  can 
arise  in  practice. 

When  a  single  address  request  is  accepted  it  is  necessary  to 
obtain  an  "equivalence"  indication  and  form  the  page  location 
digits  before  the  stack  request  can  be  generated.  The  SET  CSF 
signal  then  occurs  as  soon  as  the  read  phase  starts.  If  a  "not  equiva- 
lent" or  "equivalent  and  locked  out"  indication  is  obtained  a  stack 
request is not generated, and the contents of the B.A.R. are copied
into a line of the V-store before SET CSF is generated.

282  Part 3 | The instruction-set processor level: variations in the processor

Section 6 | Processors with multiprogramming ability

[Figure 4: block diagram of the main core store; stacks 0 and 1 hold pages 0 to 15, stacks 2 and 3 hold pages 16 to 31.]

Fig. 4. Main core store control.

When access to a pair of addresses is requested (i.e., an instruction
pair) the stack requests are generated on the assumption that
these  instructions  are  located  in  the  same  page  position  as  the  last 
pair  requested,  i.e.,  the  page  position  digits  are  taken  from  the 
page  digit  register.  (See  Fig.  4.)  In  this  way  the  time  required  to 
obtain  the  equivalent  indication  and  form  the  page  location  digits 
is not included in the over-all access time of the system. The
assumption  will  normally  be  true,  except  when  crossing  block 
boundaries.  The  latter  cases  are  detected  and  corrected  by  com- 
paring the  true  position  page  digits  obtained  as  a  result  of  the 


equivalence  operation  with  the  contents  of  the  page  digit  register 
and  a  "right  page"  or  "wrong  page"  indication  is  obtained.  (See 
Fig.  4.)  If  a  wrong  page  is  accessed  this  is  indicated  to  the  central 
machine  and  the  read  out  is  inhibited.  The  true  page  location 
digits  are  copied  into  the  page  digit  register,  so  that  the  required 
instruction  pair  will  be  obtained  when  next  requested.  The  read 
out  to  the  central  machine  is  also  inhibited  for  "not  equivalent" 
or  "equivalent  and  locked  out"  indications. 
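
The use of the page digit register for instruction pairs amounts to guessing the page and correcting the guess afterwards. A minimal sketch of that behaviour is given below. The class `PairFetcher`, its method names, and the representation of the page address registers as a Python list are hypothetical, and the lock out digit is left out for brevity.

```python
class PairFetcher:
    """Toy model of the page digit register used for instruction pairs."""

    def __init__(self, page_address_registers):
        # page_address_registers[i] holds the block address now in page i
        self.par = list(page_address_registers)
        self.page_digit_register = 0

    def fetch_pair(self, block):
        """Return 'right page', 'wrong page' or 'not equivalent'."""
        guessed_page = self.page_digit_register   # stacks are started on this guess
        if block not in self.par:
            return "not equivalent"               # read-out inhibited
        true_page = self.par.index(block)         # result of the equivalence operation
        if true_page == guessed_page:
            return "right page"
        self.page_digit_register = true_page      # corrected for the next request
        return "wrong page"                       # read-out inhibited this time
```

With blocks 7, 3 and 9 held in pages 0, 1 and 2, a request for block 3 first reports a wrong page and succeeds when repeated, because the register has by then been corrected.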

In  Fig.  5  the  waiting  time  indicated  immediately  before  the 
stack  request  is  generated  can  arise  for  a  number  of  reasons. 

1  The  preceding  write  phase  of  that  stack  has  not  yet  finished. 

2  The central machine is not yet ready either to accept information
from the store, or to supply information to it.


[Figure 5: flow diagram. Recoverable labels: wait for equivalence and formation of page digits; not equivalent, or equivalent and locked out; copy B.A.R. to line; stack request; SET CSF.]

Fig. 5. Flow diagram of main core store control.



3  It  is  necessary  to  ensure  a  certain  minimum  time  between 
successive  read  strobes  from  the  core  store  stacks  to  allow 
satisfactory  operation  of  the  parity  circuits,  which  take 
about 0.4 μsec to check the information. This time could
be  reduced,  but  as  it  is  only  possible  to  get  such  a  condition 
for  a  small  part  of  the  normal  instruction  timing  cycle  it 
was  not  thought  to  be  an  economical  proposition. 

The basic machine timing is now discussed.

4.    Instruction times

In  high-speed  computers,  one  of  the  main  factors  limiting  speed 
of  operation  is  the  store  cycle  time.  Here  a  number  of  techniques, 
e.g.,  splitting  the  core  store  into  four  separate  stacks  and  extracting 
two  instructions  in  a  single  cycle,  have  been  adopted  despite  a 
fast basic cycle time of 2 μsec in order to alleviate this situation.
The time taken to complete an instruction is dependent upon

1  The type of instruction (which is defined by the function
digits)

2  The exact location of the instruction and operand in the
core or fixed store since this can affect the access time

3  Whether or not the operand address is to be modified

4  In the case of floating point accumulator orders, the actual
numbers themselves

5  Whether drum and/or tape transfers are taking place


The  approximate  times  for  various  instructions  are  given  in 
Table  2.  These  figures  relate  to  the  times  between  completing 
instructions when a long sequence of the same type of instruction
is obeyed. While this method is not ideal, it is necessary because
in practice obeying one instruction is overlapped in time with
some part of three other instructions. This makes the detailed
timing complicated, and so the timing sequence is developed
slowly by first considering instructions obeyed one after another.
It is convenient to make these instructions a sequence of floating
point additions with both instruction and operand in the core store
and  with  the  operand  address  single  B-modified. 

To  obey  this  instruction  the  central  machine  makes  two  re- 
quests to  the  core  store,  one  for  the  instruction  and  the  second 
for the operand. After the instruction is received in the machine
the function part has to be decoded and the operand address
modified by the contents of one of the B registers before the
operand request can be made. Finally, after the operand has been
obtained the actual accumulator addition takes place to complete
the instruction. The time from beginning to end of one instruction
is 6.05 μsec and an approximate timing schedule is given in
Table 3.

If  no  other  action  is  permitted  in  the  time  required  to  complete 
the  instruction  (steps  1  to  8  in  Table  3),  then  the  different  sections 
of the machine are being used very inefficiently, e.g., the accumulator
adder is only used for less than 1.1 μsec. However, the organization
of the computer is such that the different sections such
as  store  stacks,  accumulator  and  B-arithmetic  unit,  can  operate 


Table 2    Approximate instruction times (in μsec)

                                         Number of        Instructions in   Instructions in   Instructions in
                                         modifications    core store,       fixed store,      fixed store,
                                         of address       operands in       operands in       operands in
Type of instruction                                       core store        core store        fixed store

Floating Point Addition                  0                1.4               1.65              1.2
                                         1                1.6               1.65              1.2
                                         2                2.03              1.9               1.9
Floating Point Multiplication            0, 1 or 2        4.7               4.7               4.7
Floating Point Division                  0, 1 or 2        13.6              13.6              13.6
Add Store Line to an Index Register      0                1.53              1.65              1.15
                                         1                1.85              1.85              1.85
Add Index Register to Store Line and     0                1.63              1.65
  Rewrite to Store Line                  1                1.8               1.7



Table 3†  Timing sequence for floating point addition (instructions
and operands in the core store)

                                                 Time interval     Total
                                                 between steps,    time,
Sequence                                         μsec              μsec

1. Add 1 to Main Control                                           0
     (Addition time)                             0.3
2. Make Instruction Request                                        0.3
     (Transfer times, equivalence time
     and stack access time)                      1.75
3. Receive Instruction in Central Machine                          2.05
     (Load register and decode)                  0.2
4. Function decoding complete                                      2.25
     (Single address modification)               0.85
5. Request Operand                                                 3.10
     (Transfer times, equivalence time
     and stack access time)                      1.75
6. Receive Operand in Central Machine                              4.85
     (Load register)                             0.1
7. Start Addition in Accumulator                                   4.95
     (Average floating point addition,
     including shift round and standardise)      1.1
8. Instruction complete                                            6.05

† In step 4, time is for single address modification. Times for no modification
and two modifications are 0.25 μsec and 1.55 μsec respectively.


at  the  same  time.  In  this  way  several  instructions  can  be  started 
before  the  first  has  finished,  and  then  the  effective  instruction  time 
is  considerably  reduced.  There  have,  of  course,  to  be  certain  safe- 
guards when  for  example  an  instruction  is  dependent  in  any  way 
on  the  completion  of  a  preceding  instruction. 
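
The running totals of Table 3 can be checked by accumulating the intervals between steps. The short sketch below does this for the non-overlapped case. The variable and function names are illustrative only, and the totals for no modification and for two modifications are simply what the footnote to Table 3 implies when everything else is left unchanged.

```python
from itertools import accumulate

# Intervals between steps 1..8 of Table 3, in microseconds.
INTERVALS = [0.3, 1.75, 0.2, 0.85, 1.75, 0.1, 1.1]
# Step 4 to step 5 for 0, 1 or 2 address modifications (footnote to Table 3).
MODIFICATION_TIME = {0: 0.25, 1: 0.85, 2: 1.55}


def cumulative_times(modifications=1):
    """Total time at each of the eight steps of one non-overlapped instruction."""
    intervals = list(INTERVALS)
    intervals[3] = MODIFICATION_TIME[modifications]
    return [round(total, 2) for total in accumulate([0.0] + intervals)]
```

With single modification the last entry reproduces the 6.05 μsec of Table 3; on the same basis the sketch gives 5.45 μsec with no modification and 6.75 μsec with two.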

In  the  time  sequence  previously  tabulated,  by  far  the  longest 
time  was  that  between  a  request  in  the  central  machine  for  the 
core store and the receipt in the central machine of the information
from that store. This effective access time of 1.75 μsec is
made  up  as  shown  in  Table  4.  It  has  been  reduced  in  practice 
by  the  provision  of  two  buffer  registers,  one  in  the  central  machine 
and  the  other  in  the  core  stack  coordinator.  These  allow  the 
equivalence  and  transfer  times  to  be  overlapped  with  the  organi- 
zation of  requests  in  the  central  machine. 

In this way, provided the machine can arrange to make requests
fast enough, the effective access time is reduced to 0.8 μsec.
Further, since three accesses are needed to complete two instructions
(one for an instruction pair and one for each of the two
operands) the theoretical minimum time of an instruction is
$3 \times 0.8/2 = 1.2$ μsec, and it then becomes store limited. Reference to
Table 3 shows that the arithmetic operation takes 1.2 μsec to
complete so that, on the average, the capabilities of the store and
the accumulator are well matched.

Another  technique  for  reducing  store  access  time  for  instruc- 
tions has  also  been  adopted.  This  permits  the  read  cycles  of  the 
two  stacks  to  start  assuming  that  the  same  page  will  be  referred 
to  as  in  the  previous  instruction  pair.  This,  of  course,  will  normally 
be  true  and  there  is  sufficient  time  to  take  corrective  procedures 
should the page have been changed. The limit of 1.2 μsec per
instruction  is  not  reduced  by  this  technique,  but  the  possibility 
of  reaching  this  limit  under  other  conditions  is  enhanced. 

A  schematic  diagram  of  the  practical  timing  of  a  sequence  of 
floating  point  addition  orders  is  shown  in  Fig.  6.  The  overlapping 
is not perfect and in the time between successive instruction pairs
the computer is obeying four instructions for 25 per cent of the
time, three for 56 per cent and two for 19 per cent. It is therefore
to be expected that the practical time for the complete order is
greater than the theoretical minimum time; it is in fact approximately
1.6 μsec.
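
Two of the figures in this discussion follow from simple arithmetic, sketched below: the store-limited minimum of three accesses per instruction pair, and the mean number of instructions in progress implied by the overlap percentages. The function names are illustrative, and the mean is a derived quantity that the text itself does not quote.

```python
def store_limited_time(access_time=0.8, accesses_per_pair=3, instructions_per_pair=2):
    """Minimum time per instruction when the store is the only limit."""
    return accesses_per_pair * access_time / instructions_per_pair


def mean_instructions_in_progress(profile):
    """profile maps 'instructions being obeyed at once' to the fraction of time."""
    if abs(sum(profile.values()) - 1.0) > 1e-9:
        raise ValueError("fractions of time must sum to 1")
    return sum(count * share for count, share in profile.items())


# Overlap quoted in the text: four instructions for 25 per cent of the
# time, three for 56 per cent and two for 19 per cent.
OVERLAP = {4: 0.25, 3: 0.56, 2: 0.19}
```

The first function gives the 1.2 μsec limit; the second gives a mean of about 3.06 instructions in progress, which is consistent with the statement that the overlapping is not perfect.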

For  certain  types  of  functions  the  reading  of  the  next  pair  of 
instructions before completing both instructions of the first pair
would be incorrect, e.g., functions causing transfer of control. Such
situations are recognized during the function decoding, and the
request  for  the  next  instruction  pair  is  held  up  until  a  suitable 
time. 

In  a  sequence  of  floating  point  addition  orders  with  the  operand 
addresses unmodified the limit is again 1.2 μsec while the time
obtained is 1.4 μsec. For accumulator orders in which the actual
accumulator operation imposes a limit in excess of 2 μsec then
the  actual  time  is  equal  to  this  limit. 

Perhaps a more realistic way of defining the speed of the computer
is to give the time for a typical inner loop of instructions.
A frequently occurring operation in matrix work is the formation
of the scalar product of two vectors; this requires a loop of five
instructions:

Table 4    Effective store access time

                                                              Total time,
Sequence                                                      μsec

1. Request in Central Machine                                 0
2. Request in Core Stack Coordinator                          0.25
3. Equivalence complete and request made to selected stack    0.95
4. Information in Core Stack Coordinator                      1.65
5. Information in Central Machine                             1.75




[Figure 6: timing diagram of six overlapped instructions. Recoverable labels: instruction request, equivalence, stack request, read, function decode, B-modification, operand request, copy to accumulator, accumulator busy, start second of pair, start next pair.]

Fig. 6. Timing diagram for a sequence of floating point addition orders. (Single-address modification.)


1  Element of first vector into accumulator. (Operand B-modified.)

2  Multiply accumulator by element of second vector. (Operand
B-modified.)

3  Add partial product to accumulator.

4  Copy accumulator to store line containing partial product.

5  Alter count to select next elements and repeat.

The time for this loop with instructions and operands in the
core store is 12.2 μsec. The value of the overlapping technique
is shown by the fact that the time from starting the first instruction
to finishing the second is approximately 10 μsec.
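
The five-instruction loop can be written out step by step as below. The comments tie each line to the numbered instructions of the loop, and the time estimate simply multiplies the quoted loop time by the number of elements. The function names are illustrative, and ignoring any setup time outside the loop is an assumption of the sketch.

```python
LOOP_TIME_USEC = 12.2   # time for one pass of the five-instruction loop (from the text)


def scalar_product(x, y):
    """Scalar product formed in the order the five-instruction loop uses."""
    partial = 0.0                 # store line holding the partial product
    for i in range(len(x)):       # 5: alter count, select next elements, repeat
        acc = x[i]                # 1: element of first vector into accumulator
        acc = acc * y[i]          # 2: multiply by element of second vector
        acc = acc + partial       # 3: add partial product to accumulator
        partial = acc             # 4: copy accumulator back to the store line
    return partial


def estimated_time_usec(n):
    """Rough time for an n-element scalar product, ignoring setup."""
    return n * LOOP_TIME_USEC
```

On this basis a scalar product of two 100-element vectors would take roughly 1.2 msec.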

When  the  drum  or  tape  systems  are  transferring  information 
to  or  from  the  core  store  then  the  rate  of  obeying  instructions 
which also use the core store will be affected. The effect is discussed
in more detail in Appendix 1. The degree of slowing down
is dependent upon the time at which a drum or tape request occurs
relative  to  machine  requests.  It  also  depends  on  the  stacks  used 
by  the  drum  or  tape  and  those  being  used  by  the  central  machine. 
The  approximate  slowing  down  is  by  a  factor  of  25  per  cent  during 
a  drum  transfer  and  by  2  per  cent  for  each  active  tape  channel. 
(See  Appendix  1.) 

5.    The  drum  transfer  learning  program 

The  organization  of  drum  transfers  has  been  described  in  Sec.  2A. 
After the transfer of the required block from the drum to the core


store  has  been  initiated,  the  organizing  program  examines  the  state 
of  the  core  store,  and  if  empty  pages  still  exist,  no  further  action 
is  taken.  However,  if  the  core  store  is  full  it  is  necessary  to  arrange 
for  an  empty  page  to  be  made  available  for  use  at  the  next  non- 
equivalence.  The  selection  of  the  page  to  be  transferred  could  be 
made at random; this could easily result in many additional transfers
occurring, as the page selected could be one of those in current
use  or  one  required  in  the  near  future.  The  ideal  selection,  which 
would  minimize  the  total  number  of  transfers,  could  only  be  made 
by  the  programmer.  To  make  this  ideal  selection  the  programmer 
would have to know (1) precisely how his program operated, which
is  not  always  the  case,  and  (2)  the  precise  amount  of  core  store 
available  to  his  program  at  any  instant.  This  latter  information 
is  not  generally  available  as  the  core  store  could  be  shared  by  other 
central  machine  programs,  and  almost  certainly  by  some  fixed  store 
program  organizing  the  input  and  output  of  information  from  slow 
peripheral equipments. The amount of core store required by this
fixed store program is continuously varying [Kilburn et al., 1961].
The  only  way  the  ideal  pattern  of  transfers  can  be  approached 
is  for  the  transfer  program  to  monitor  the  behavior  of  the  main 
program  and  in  so  doing  attempt  to  select  the  correct  pages  to 
be  transferred  to  the  drum.  The  techniques  used  for  monitoring 
are  subject  to  the  condition  that  they  must  not  slow  down  the 
operation  of  the  program  to  such  an  extent  that  they  offset  any 
reduction  in  the  number  of  transfers  required.  The  method  de- 
scribed occupies  less  than  1  per  cent  of  the  operating  time,  and 
the  reduction  in  the  number  of  transfers  is  more  than  sufficient 
to  cover  this. 




That  part  of  the  transfer  program  which  organizes  the  selection 
of  the  page  to  be  transferred  has  been  called  the  "learning"  pro- 
gram. In  order  for  this  program  to  have  some  data  on  which  to 
operate,  the  machine  has  been  designed  to  supply  information 
about  the  use  made  of  the  different  pages  of  the  core  store  by 
the  program  being  monitored. 

With  each  page  of  the  core  store  there  is  associated  a  "use" 
digit  which  is  set  to  "1"  whenever  any  line  in  that  page  is  accessed. 
The  32  "use"  digits  exist  in  two  lines  of  the  V-store  and  can  be 
read  by  the  learning  program,  the  reading  automatically  resetting 
them  to  zero.  The  frequency  with  which  these  digits  are  read  is 
governed  by  a  clock  which  measures  not  real  time  but  the  number 
of  instructions  obeyed  in  the  operation  of  the  main  program.  This 
clock  causes  the  learning  program  to  copy  the  "use"  digits  to  a 
list  in  the  subsidiary  store  every  1024  instructions.  The  use  of  an 
instruction  counter  rather  than  a  normal  clock  to  measure  "time" 
for  the  learning  program  is  due  to  the  fact  that  the  operations 
of  the  main  program  may  be  interrupted  at  random  for  random 
lengths  of  time  by  the  operation  of  peripheral  equipments.  With 
an  instruction  counter  the  temporal  pattern  of  the  blocks  used 
will  be  the  same  on  successive  runs  through  the  same  part  of  the 
program.  This  is  essential  if  the  learning  program  is  to  make  use 
of  this  pattern  to  minimize  the  number  of  transfers. 

When  a  nonequivalence  occurs  and  after  the  transfer  of  the 
required  block  has  been  arranged,  the  learning  program  again  adds 
the  current  values  of  the  "use"  digits  to  the  list  and  then  uses 
this  list  to  bring  up  to  date  two  sets  of  times  also  kept  in  the 
subsidiary  store.  These  sets  consist  of  32  values  of  t  and  T,  one 
of  each  for  each  page  of  the  core  store.  The  value  of  t  is  the  length 
of  time  since  the  block  in  that  page  has  been  used.  The  value  of 
T  is  the  length  of  the  last  period  of  inactivity  of  this  block.  The 
accuracy of the values of t and T is governed by the frequency
with  which  the  "use"  digits  are  inspected. 

The page to be written to the drum is selected by the application
in turn of three simple tests to the values of t and T.

1  Any page for which $t > T + 1$, or

2  That page with $t \neq 0$ and $(T - t)$ max, or

3  That page with $T$ max (all $t = 0$).

The  first  rule  selects  any  page  which  has  been  currently  out 
of  use  for  longer  than  its  last  period  of  inactivity.  Such  a  page 
has  probably  ceased  to  be  used  by  the  program  and  is  therefore 
an ideal one to be transferred to the drum. The second rule ignores
all pages with t = 0 as they are in current use, and then selects
the  one  which,  if  the  pattern  of  use  is  maintained,  will  not  be 


required  by  the  program  for  the  longest  time.  If  the  first  two  rules 
fail  to  select  a  page  the  third  ensures  that  if  the  page  finally 
selected  is  wrong,  in  that  it  is  immediately  required  again,  then, 
as  in  this  case,  T  will  become  zero  and  the  same  mistake  will  not 
be  repeated. 
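
The three tests can be sketched as a single selection function, applied to the lists of t and T for the pages of the core store. Taking the first page that satisfies the first rule, and breaking ties in favour of the lowest page number, are assumptions of this sketch, since the text says only "any page".

```python
def select_page(t, T):
    """Return the index of the page to be written to the drum.

    t[i]: time since the block in page i was last used.
    T[i]: length of that block's last period of inactivity.
    """
    pages = range(len(t))
    # Rule 1: out of use for longer than its last period of inactivity.
    for i in pages:
        if t[i] > T[i] + 1:
            return i
    # Rule 2: among pages not in current use, the one expected to stay
    # idle for the longest time if the pattern of use is maintained.
    idle = [i for i in pages if t[i] != 0]
    if idle:
        return max(idle, key=lambda i: T[i] - t[i])
    # Rule 3: every page is in current use; take the one with the largest T.
    return max(pages, key=lambda i: T[i])
```

For example, with t = [0, 5, 1] and T = [4, 2, 9] the first rule picks page 1, which has been idle for 5 units against a previous idle period of 2.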

For all the blocks on the drum a list of values of τ is kept.
The values of τ are set when the block is transferred to the drum:

$\tau$ = time of transfer − value of $t$ for transferred page

When a block is transferred to the core store the value of τ is
used to set the value of T:

$T$ = time of transfer − value of $\tau$ for this block
  = length of last period of inactivity

For the block transferred from the drum t is set to 0.
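
The bookkeeping for a block that leaves the core store and later returns can be sketched as two small functions. The value kept for a block while it is on the drum is called `tau` here, and the function names and the use of a common time scale `now` are assumptions made for the example.

```python
def to_drum(now, t_page):
    """Value recorded for a block when it is transferred to the drum."""
    return now - t_page          # in effect, the time at which the block was last used


def to_core(now, tau):
    """Values (t, T) given to a block when it returns to the core store."""
    T = now - tau                # length of the period of inactivity just ended
    t = 0                        # the block is about to be used
    return t, T
```

A block last used 7 units before being written out at time 100 is recorded with the value 93; if it is brought back at time 150, T becomes 57, the whole period for which it went unused.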

In  order  to  make  its  decision  the  learning  program  has  only 
to update two short lists and apply at the most three simple rules;
this can easily be done during the 2 msec transfer time of the block
required as a result of the nonequivalence. As the learning program
uses only fixed and subsidiary store addresses it is not slowed down
during  the  period  of  the  drum  transfer. 

The  over-all  efficiency  of  the  learning  program  cannot  be 
known  until  the  complete  Atlas  system  is  working.  However,  the 
value  of  the  method  used  has  been  investigated  by  simulating  the 
behavior  of  the  one-level  store  and  learning  program  on  the 
Mercury  computer  at  Manchester  University.  This  has  been  done 
for  several  problems  using  varying  amounts  of  store  in  excess  of 
the  core  store  available.  One  of  these  was  the  problem  of  forming 
the  product  A  of  two  80th  order  matrices  B  and  C.  The  three 
matrices were stored row by row, each one extending over 14
blocks; only 14 pages of core store were assumed to be available.
The  method  of  multiplication  was 

$b_{11}$ × 1st row of C = partial answer to 1st row of A
$b_{12}$ × 2nd row of C + partial answer = second partial answer,
etc.

Thus  matrix  B  was  scanned  once,  matrix  C  80  times  and  each  row 
of  matrix  A  80  times. 
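
The row-by-row method of multiplication, and the pattern of store references it produces, can be sketched as follows. The counters record how often each row of A and of C is touched. The function name and the counting scheme are illustrative and do not reproduce the simulation program that was actually used.

```python
def multiply_by_rows(B, C):
    """Form A = B x C by adding scaled rows of C into rows of A."""
    n = len(B)
    A = [[0.0] * n for _ in range(n)]
    row_uses = {"A": [0] * n, "C": [0] * n}    # how often each row is touched
    for i in range(n):
        for k in range(n):
            b = B[i][k]                        # B is scanned once, element by element
            for j in range(n):
                A[i][j] += b * C[k][j]         # b_ik x (row k of C) added to row i of A
            row_uses["A"][i] += 1
            row_uses["C"][k] += 1
    return A, row_uses
```

For matrices of order n, every row of A and every row of C is referenced n times, which for n = 80 gives the 80 scans of matrix C mentioned above.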

Several  machine  users  were  asked  to  spend  a  short  time  writing 
a  program  to  organize  the  transfers  for  a  general  matrix  multipli- 
cation problem.  In  no  case  when  the  method  was  applied  to  the 
above  problem  were  fewer  than  357  transfers  required.  A  program 
written  specifically  for  this  problem  which  paid  great  attention 
to  the  distribution  of  the  rows  of  the  matrices  relative  to  block 
divisions  required  234  transfers.  The  learning  program  required 
274  transfers;  the  gain  over  the  human  programmer  was  chiefly 


due  to  the  fact  that  the  learning  program  could  take  full  advantage 
of  the  occasions  when  the  rows  of  A  existed  entirely  within  one 
block. 

Many  other  problems  involving  cyclic  running  of  single  or 
multiple  sets  of  data  were  simulated,  and  in  no  case  did  the  learn- 
ing program  require  more  transfers  than  an  experienced  human 
programmer. 

A.  Prediction  of  drum  transfers 

Although  the  learning  program  tends  to  reduce  the  number  of 
transfers  required  to  a  minimum,  the  transfers  which  do  occur  still 
interrupt the operation of the program for from 2 to 14 msec as
they  are  initiated  by  nonequivalence  interrupts.  Some  or  all  of 
this  time  loss  could  be  avoided  by  organizing  the  transfers  in 
advance.  A  very  experienced  programmer  having  sole  use  of  the 
core  store  could  arrange  his  own  transfers  in  such  a  way  that  no 
unnecessary  ones  ever  occurred  and  no  time  was  ever  wasted 
waiting  for  transfers  to  be  completed.  This  would  require  a  great 
deal of effort and would only be worthwhile for a program that
was  going  to  occupy  the  machine  for  a  long  time.  By  using  the 
data  accumulated  by  the  learning  program  it  is  possible  to  recog- 
nize simple  patterns  in  the  use  made  by  a  program  of  the  various 
blocks  of  the  one-level  store.  In  this  way  a  prediction  program 
could forecast the blocks required in the near future and organize
the transfers. By recording the success or failure of these forecasts
the  program  could  be  made  self-improving.  For  the  matrix  multi- 
plication problem  discussed  above  the  pattern  of  use  of  the  blocks 
containing  matrix  C  is  repeated  80  times,  and  a  considerable 
degree  of  success  could  be  obtained  with  a  simple  prediction 
program. 
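
The text leaves the form of such a prediction program open. Purely as an illustration of the idea, the sketch below forecasts that the block demanded next will be the one that followed the current block the last time it was used, and keeps a count of its successes and failures. The class `SuccessorPredictor` and its rule are hypothetical and are not taken from the Atlas design.

```python
class SuccessorPredictor:
    """Forecast the next block as the one that followed the current block last time."""

    def __init__(self):
        self.successor = {}      # block -> block that was demanded after it
        self.last_block = None
        self.hits = 0            # forecasts that proved correct
        self.misses = 0          # forecasts that proved wrong

    def predict(self):
        """Return the forecast for the next block, or None if there is none."""
        return self.successor.get(self.last_block)

    def record(self, block):
        """Record the block actually demanded and score the previous forecast."""
        forecast = self.predict()
        if forecast is not None:
            if forecast == block:
                self.hits += 1
            else:
                self.misses += 1
        if self.last_block is not None:
            self.successor[self.last_block] = block
        self.last_block = block
```

For a strictly cyclic pattern of use, such as the repeated scans of matrix C, a rule of this kind makes no forecast during the first cycle and is then correct on every later request.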

6.  Conclusions 

A specific system for making a core-drum store combination appear
as  a  single  level  store  has  been  described.  While  this  is  the  actual 
system  being  built  for  the  Atlas  machine  the  principles  involved 
are applicable to combinations of other types of store, for example,
a tunnel diode-fast core store combination for an even faster
machine. An alternative which was considered for Atlas, but which
was  not  as  attractive  economically,  was  a  fast  core-slow  core  store 
combination.  The  system  too  can  be  extended  to  three  levels  of 
storage,  and  indeed  if  10^  words  of  total  storage  had  to  be  provided 
then  it  would  be  most  economical  to  provide  it  on  a  third  level 
of store such as a file drum.

The automatic system does require additional equipment and
introduces some complexity, since it is necessary to overlap the
time taken for address comparison into the store and machine
operating time if it is not to introduce any extra time delays.
Simulated tests have shown that the organization of drum transfers
is reasonably efficient, and the other advantages which accrue, such
as efficient allocation of core storage between different programs
and store lockout facilities, are also invaluable. No matter how
intelligent a programmer may be, he can never know how many
programs or peripheral equipments are in operation when his
program is running. The advantage of the automatic system is that
it takes into account the state of the machine as it exists at any
particular time. Furthermore if, as in normal use, there is some sort
of regular machine rhythm even through several programs, there
is the possibility of making some sort of prediction with regard
to the transfers necessary. This involves no more hardware and
will be done by program. However, this stage will probably be left
until results on the actual system are obtained.

It  can  be  seen  that  the  system  is  both  useful  and  flexible  in 
that  it  can  be  modified  or  extended  in  the  manner  previously 
indicated.  Thus  despite  the  increase  in  equipment,  the  advantages 
which  are  derived  completely  justify  the  building  of  this  automatic 
system. 

APPENDIX  1    ORGANIZATION  OF  THE  ACCESS  REQUESTS 
TO  THE  CORE  STORE 

There are three sources of access requests to the core store, namely
the central machine, the drum, and the tape systems. In deciding
how the sequence of requests from all three sources is to be
serialized and placed in some sort of order, a number of facts have
to be considered. These are

1  All three sources are asynchronous in nature.

2  The drum and tape systems can make requests at a fairly
high rate compared with the store cycle time of approximately
2 µsec. For example, the drum provides a request
every 4 µsec and the tape system every 11 µsec when all
8 channels are operative.

3  The drum and tape systems can only be stopped in multiples
of a block length, i.e., 512 words. This means that any system
devised for accessing the core store must deal with both
the average rates of drum and tape requests specified in 2.
Only the central machine can tolerate requests being stopped
at any time and for any length of time. From these facts a
request priority can be stated which is
a  Drum request.
b  Tape request.
c  Central machine request.


[Fig. 7 flowchart; the caption appears below. Box labels: drum/tape request; apply inhibits to stack request channels and to machine request channels (if these are not already applied); F flip-flop frozen; inspect state of F flip-flop; busy: wait for equivalence completed, store machine order; free F flip-flop; drum/tape priority; drum/tape access to core store; remove stack request inhibit signals; permit stack request inhibits to reapply; is there a stored machine order? yes: allow to proceed (if possible); stack request of stored machine order; has the stack request of a stored machine order been stopped?; remove inhibits on machine request channels.]



4  A machine request can be accepted by the core store, but
because there is no place available to accept the core store
information, its cycle is inhibited and further requests held
up. In the case of successive division orders this time can
be as long as 20 µsec, in which case 5 drum requests could
be made. To avoid having an excessive amount of buffer
storage for the drum two techniques are possible:

a  When drums or tapes are operative do not permit machine
requests to be accepted until there is a place
available to put the information.

b  Store the machine request and then permit a drum or
tape request.

The  latter  scheme  has  been  adopted  because  it  can  be 
accommodated  more  conveniently  and  it  saves  a  small 
amount  of  time. 

5  If  the  central  machine  is  using  the  private  store  then  it  is 
desirable  for  drum  and  tape  transfers  to  the  core  store  not 
to  interfere  with  or  slow  down  the  central  machine  in  any 
way. 

6  When  the  central  machine,  drum  and  tape  are  sharing  the 
core  store  then  the  loss  of  central  machine  speed  should 
be  roughly  proportional  to  the  activity  of  the  drum  or  tape 
systems.  This  means  that  drum  or  tape  requests  must 
"break"  into  the  normal  machine  request  channel  as  and 
when  required. 
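
The request priority stated in point 3 and the buffer arithmetic of point 4 can be sketched in a few lines of Python. This is a minimal illustration, not the Atlas coordinator, which is asynchronous hardware; the function names and the string labels are assumptions introduced for the example.

```python
PRIORITY = ("drum", "tape", "machine")   # highest priority first, as in point 3


def next_request(pending):
    """Return the pending source to serve next, or None if nothing is pending."""
    for source in PRIORITY:
        if source in pending:
            return source
    return None


def requests_during(holdup_usec, period_usec):
    """Number of requests a source makes while the store is held up."""
    return int(holdup_usec // period_usec)


print(next_request({"machine", "tape"}))   # tape outranks the central machine
print(requests_during(20, 4))              # drum requests during a 20 usec hold-up
```

With the figures given in the text (a hold-up of 20 µsec and a drum request every 4 µsec), the second call reproduces the 5 drum requests that would otherwise have to be buffered.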

The system which accommodates all these points is now discussed.
Whenever a drum or tape request occurs, inhibit signals
are applied to the request channel into the core stack coordinator and
also to the stack request channels from this coordinator. This
results in a "freezing" of the state of flip-flop F (Fig. 5) and this
state is then inspected (Fig. 7, point X). If the state is "busy" this
means that a machine order has been stopped somewhere between
the loading of the buffer address register (B.A.R.) and the stack
request. Normally this time interval can vary from about 0.5 µsec
if there are no stack request holdups, to 20 µsec in the case of
certain accumulator holdups. In either case sufficient time is allowed
after the inspection to ensure that the equivalence operation
has been completed. If an equivalence indication is obtained, all
the information relevant to this machine order (i.e., the line address,
page digits, stack(s) required and type of stack order) is
stored for future reference. Use is made here of the page digit
register provided to allow the by-pass on the equivalence circuitry
for instruction accesses. The core store is then made free for access
by the drum or the tape. If the core store had been found to be
free on inspection, the above procedure is omitted.


Fig.  7.  Drum  and  tape  break  in  systems. 

A drum or tape access (as decided by the priority circuit) to
the core store then occurs, which removes the inhibits on the stack
request channels. When the stack request for the drum or tape
cycle is initiated these inhibits are allowed to reapply. At this stage
(Fig. 7, point Y), if there is a stored machine order it is allowed
to proceed if possible. The inhibits on the machine request channels
are removed when the stack request for the stored machine
order occurs. If there is no stored machine order this is done
immediately, and the central machine is again allowed access to
the core store. However, another drum or tape request can arise
before the stack request of the stored machine order occurs, in
particular because this latter order may still be held up by the
central machine. If this is the case the drum or tape is allowed
immediate access and a further attempt is made to complete the
stored machine order when this drum or tape stack request occurs.

If the stored machine order was for an operand, the content
of the page digit register will correspond to the location of this
operand.  The  next  machine  request  for  an  instruction  pair  will 
then  almost  certainly  result  in  a  "wrong  page"  indication.  This 
is  prevented  by  arranging  that  the  next  instruction  pair  access  does 
not  by-pass  the  equivalence  circuitry. 

The effect on the machine speed when the drum or tapes are
transferring information to or from the core store is dependent
upon two factors: first, upon the proportion of time during which
the buffer register in the core coordinator is busy dealing with
machine requests, and secondly, upon the particular stacks being
used by the central machine and the drum or tape. If the computer
is obeying a program with instructions and operands on the fixed
or subsidiary store then the rate of obeying instructions is unaffected
by drum or tape transfers. A drum or tape interrupt
occurring when the B.A.R. is free prevents any machine address
being accepted onto this buffer for 1.0 µsec. However, if the B.A.R.
is busy then the next machine request to the core store is delayed
until 1.8 µsec after the interrupt if different stacks are being used,
or until 3.4 µsec after the interrupt if the stacks are the same.

When the machine is obeying a program with instructions and
operands on the core store the slowing down during drum transfers
can be by a factor of two if instructions, operands, and drum
requests use the same stacks. It is also possible for the machine
to be unaffected. The effect on a particular sequence of orders
can be seen by considering the one discussed in Sec. 4 and illustrated
in Fig. 6. In this sequence the instructions are on stacks
0 and 1 while the operands are on stacks 2 and 3. If the drum
or tape is transferring alternately to stacks 0 and 1 then the effect
of any interrupt within the 3.2 µsec of an instruction pair is to
increase this time by between 0.5 and 3.4 µsec depending upon
where the interrupt occurred. The average increase is 1.8 µsec
and for a tape transfer with interrupts every 88 µsec the computer
can obey instructions at 98 per cent of the normal rate. During
drum transfers the interrupts occur every 4 µsec which would
suggest a slowing down to 60 per cent of normal. However, for
any regular sequence of orders the requests to the core store by
the machine and by the drum rapidly become synchronized with
the result in this particular case that the machine can still operate
at 80 per cent of its normal speed.
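
The tape figure quoted above can be checked with a rough model, sketched here in Python. The model is an assumption made for illustration: every interrupt is taken to cost the average 1.8 µsec, and interrupts are taken to arrive at a fixed period of real time, so the useful fraction of machine time is 1 - delay/period. The function name is hypothetical.

```python
def fraction_of_normal_rate(interrupt_period_usec, mean_delay_usec=1.8):
    """Useful fraction of machine time if each interrupt costs mean_delay_usec."""
    return 1.0 - mean_delay_usec / interrupt_period_usec


tape = fraction_of_normal_rate(88.0)   # tape transfer, interrupts every 88 usec
drum = fraction_of_normal_rate(4.0)    # drum transfer, interrupts every 4 usec
print(f"tape: {tape:.0%}, drum: {drum:.0%}")
```

For the tape this gives about 98 per cent, as stated. For the drum the same model gives about 55 per cent, of the same order as the 60 per cent suggested in the text; the averaging actually used by the authors is not stated. The model ignores the synchronization of machine and drum requests, which is what raises the figure to 80 per cent in the case described.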

APPENDIX  2    METHODS  OF  DIVISION  OF  THE  MAIN 
CORE  STORE 

The maximum frequency with which requests can be dealt with
by a single stack core store is governed by the cycle time of the
store. If the store is divided into several stacks which can be cycled
independently then the limit imposed on the speed of the machine
by the core store is reduced. The degree of division which is chosen
is dependent upon the ratio of core store cycle time to other
machine operations and also upon the cost of the multiple selection
mechanisms required.

Considering a sequence of orders in which both the instruction
and operand are in the core store, then for a single stack store
the limit imposed on the operating speed by the store is two cycle
times per order, i.e., 4 µsec in Atlas. This is significantly larger
than the limits imposed by other sections of the computer
(Sec. 4). If the store is divided into two stacks and instructions and
operands are separated, then the limit is reduced to 2 µsec which
is still rather high. The provision of two stacks permits the addressing
of the store to be arranged so that successive addresses
are in alternate stacks. It is therefore possible by making requests
to both stacks at the same time to read two instructions together,
so reducing the number of access times to three per instruction
pair. Unfortunately such an arrangement of the store means that
operands are always on the same stacks as instruction pairs, and
the limit imposed by the cycle time is still 2 µsec per order even
if the two operand requests in the instruction pair are to different
stacks and occur at the same time.

Division into any number of stacks with the addressing system
working through each stack in turn cannot reduce the limit below
2 µsec since successive instructions normally occur in successive
addresses and are therefore in the same stack. However, four stacks
arranged in two pairs reduce the limit to 1 µsec as the operands
can always be arranged to be on different stacks from the instruction
pairs. In order to reduce the limit to 0.5 µsec it is necessary
to have eight stacks arranged in two sets of four and to read four
instructions at once, which would increase the complexity of the
central machine.
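
The limits quoted in the last two paragraphs follow from counting how many store cycles must occur one after another for a given number of orders. The Python sketch below restates that arithmetic; the helper name limit_per_order and the cycle counts passed to it are a reading of the text, offered as an illustration and not as a timing model of the machine.

```python
CYCLE_USEC = 2.0   # Atlas core store cycle time, from the text


def limit_per_order(cycles, orders):
    """Time per order when `cycles` sequential store cycles serve `orders` orders."""
    return cycles * CYCLE_USEC / orders


limits = {
    # instruction cycle then operand cycle, one order
    "one stack": limit_per_order(2, 1),
    # instruction and operand cycles overlap on separate stacks
    "two stacks, instructions and operands separated": limit_per_order(1, 1),
    # one cycle reads an instruction pair, one more serves both operands
    "two stacks, successive addresses alternating": limit_per_order(2, 2),
    # instruction-pair cycle overlaps the operand cycle
    "four stacks in two pairs": limit_per_order(1, 2),
    # four instructions read at once, operands on the other four stacks
    "eight stacks in two sets of four": limit_per_order(1, 4),
}
for arrangement, usec in limits.items():
    print(f"{arrangement}: {usec} usec per order")
```

These are best-case figures: they assume operand requests never collide on a stack.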

The limit of 1 µsec is quite sufficient and further division with
the stacks arranged in pairs only enables the limit to be more easily
obtained by suitable location of the instructions and operands.

The location of instructions and operands within the core store
is under the control of the drum transfer program; thus when there
are several stacks instructions and operands are separated
wherever possible. Under these conditions it is possible to calculate
the limit imposed on the operating speed by the cycle time for
different divisions of the core store. The results are shown in
Fig. 8. For stacks arranged in pairs, instructions are read in pairs, and
in all cases both instructions and operands are assumed to be on the
core store. Operands are assumed to be selected at random from the
operand space; for instance, in the case of two stacks arranged as a
pair, successive operand requests have equal probability of being to
the same stack or to alternate stacks.

[Fig. 8 plot. Horizontal axis: number of pages of operands (0 to 40). Curve labels: one stack, 4.0; two stacks (pair), 2.5; two stacks (single), 2.0; four stacks (pair), 1.5; eight stacks (pair), 1.17.]

Fig. 8. Limit imposed by cycle time on operating speed for different
divisions of the core store.
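
The random-operand assumption lends itself to a short expected-value calculation, sketched below in Python. Two points are assumptions made here for illustration and are not stated in the text: that with four or eight stacks the instructions occupy one pair of stacks while operands are spread uniformly over the rest, and that the instruction cycle then overlaps the operand cycles. The function names are hypothetical.

```python
from fractions import Fraction

CYCLE_USEC = 2   # core store cycle time, from the text


def operand_cycles(operand_stacks):
    """Expected store cycles for the two operand requests of an instruction pair.

    The two requests fall on the same stack with probability 1/operand_stacks
    and then need two cycles in sequence; otherwise one cycle serves both.
    """
    p_same = Fraction(1, operand_stacks)
    return 2 * p_same + 1 * (1 - p_same)


def paired_limit(total_stacks):
    """Expected limit in usec per order for stacks arranged in pairs."""
    if total_stacks == 2:
        # Operands share the two stacks with the instruction pair, so the
        # instruction cycle cannot overlap the operand cycles.
        cycles = 1 + operand_cycles(2)
    else:
        # Assumed: instructions on one pair, operands on the remaining stacks.
        cycles = max(Fraction(1), operand_cycles(total_stacks - 2))
    return cycles * CYCLE_USEC / 2   # two orders per instruction pair


for n in (2, 4, 8):
    print(n, "stacks:", round(float(paired_limit(n)), 2), "usec per order")
```

Under these assumptions the expected limits are 2.5, 1.5 and about 1.17 µsec per order for two, four and eight stacks, which appear to agree with the values marked on the curves of Fig. 8.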

The limit imposed by a four stack store is never severe compared
with other limitations; for example, the sequence of floating
point addition orders discussed in Sec. 4 required 1.6 µsec per order
with ideal distribution of instructions and operands. Division into
eight stacks, although it reduces the limit, will not have an equivalent
effect on the over-all operating speed, and such a division
was not considered to be justified.

References 


KilbT62;  BrooR60;  EdwaD60;  KilbT56;  60a,  60b.  61;  LonsK56;  PapiW57; 
FothJ61;  HartD68;  HowaD61;  62,  63:  MorrDS";  SumnF62 


Chapter  24 

A user machine in a time-sharing
system¹

B.  W.  Lampson  /  W.  W.  Lichtenberger  /  M.  W.  Pirtle 

Summary  This  paper  describes  the  design  of  the  computer  seen  by  a 
machine-language  programmer  in  a  time-sharing  system  developed  at  the 
University  of  California  at  Berkeley.  Some  of  the  instructions  in  this  machine 
are  executed  by  the  hardware,  and  some  are  implemented  by  software. 
The  user,  however,  thinks  of  them  all  as  part  of  his  machine,  a  machine 
having  extensive  and  unusual  capabilities,  many  of  which  might  be  part 
of  the  hardware  of  a  (considerably  more  expensive)  computer. 

Among  the  important  features  of  the  machine  are  the  arithmetic  and 
string  manipulation  instructions,  the  very  general  memory  allocation  and 
configuration  mechanism,  and  the  multiple  processes  which  can  be  created 
by  the  program.  Facilities  are  provided  for  communication  among  these 
processes  and  for  the  control  of  exceptional  conditions. 

The input-output system is capable of handling all of the peripheral
equipment in a uniform and convenient manner through files having
symbolic names. Programs can access files belonging to a number of people,
but each person can protect his own files from unauthorized access by
others.

Some mention is made at various points of the techniques of implementation,
but the main emphasis is on the appearance of the user's machine.

Introduction 

A characteristic of a time-sharing system is that the computer seen
by the user programming in machine language differs from that
on which the system is implemented [Bright, 1964; Comfort, 1965;
Forgie, 1965; McCullogh et al., 1965; Schwartz, 1964]. In fact,
the user machine is defined by the combination of the time-sharing
hardware running in user mode and the software which controls
input-output, deals with illegal actions which may be taken by
a user's program, and provides various other services. If the hardware
is arranged in such a way that calls on the system have the
same form as the hardware instructions of the machine [Lichtenberger
and Pirtle, 1965], then the distinction becomes irrelevant
to the user; he simply programs a machine with an unusual and
powerful instruction set which relieves him of many of the problems
of conventional machine-language programming [Lampson,
1965; McCarthy et al., 1963].

¹Proc. IEEE, vol. 54, no. 12, pp. 1766-1774, December, 1966.


In a time-sharing system which has been developed by and for
the use of members of Project Genie at the University of California
at Berkeley [Lichtenberger and Pirtle, 1965], the user machine
has a number of interesting characteristics. The computer in this
system is an SDS 930, a 24 bit, fixed-point machine with one index
register, multi-level indirect addressing, a 14 bit address field, and
32 thousand words of 1.75 µs memory in two independent modules.
Figure 1 shows the basic configuration of equipment. The memory
is interleaved between the two modules so that processing and
drum transfers may occur simultaneously. A detailed description
of the various hardware modifications of the computer and their
implications for the performance of the overall system has been
given in a previous paper [Lichtenberger and Pirtle, 1965].

Briefly, these modifications include the addition of monitor and
user modes in which, for user mode, the execution of a class of
instructions is prevented and replaced by a trap to a system routine.
The protection from unauthorized access to memory has been
subsumed in an address mapping scheme: both the 16 384 words
addressable by a user program (logical addresses) and the 32 768
words of actual core memory (physical addresses) have been
divided into 2048-word pages. A set of eight six-bit hardware registers
defines a map from the logical address space to the real memory
by specifying the real page which is to correspond to each of the
user's logical pages. Implicit in this scheme is the capability of
marking each of the user's pages as unassigned or read-only, so that
any attempt to access such a page improperly will result in a trap.

All memory references in user mode are mapped. In monitor
mode,  all  memory  references  are  normally  absolute.  It  is  possible, 
however,  with  any  instruction  in  monitor  mode,  or  even  within 
a  chain  of  indirect  addressing,  to  specify  use  of  the  user  map. 
Furthermore,  in  monitor  mode  the  top  4096  words  are  mapped 
through  two  additional  registers  called  the  monitor  map.  The 
mapping  process  is  illustrated  in  Fig.  2. 
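
The mapping of a logical address through the eight registers can be sketched in Python. The page size and the number of registers are from the text; the layout of the six bits within a register (one read-only bit above a five-bit real page number) and the use of None for an unassigned page are assumptions made for illustration, since the text does not give the encoding. The names translate and MapTrap are hypothetical. The example values are those of Fig. 2.

```python
PAGE_BITS = 11                 # 2048-word pages, from the text
PAGE_SIZE = 1 << PAGE_BITS
READ_ONLY = 0o40               # assumed: top bit of the six-bit register
PAGE_MASK = 0o37               # assumed: low five bits hold the real page number
UNASSIGNED = None              # assumed marker; the real encoding is not given


class MapTrap(Exception):
    """Raised where the hardware would trap to a system routine."""


def translate(logical, user_map, write=False):
    """Map a 14-bit logical address to a real address through eight registers."""
    if not 0 <= logical < 8 * PAGE_SIZE:
        raise ValueError("logical address must fit in 14 bits")
    register = user_map[logical >> PAGE_BITS]
    if register is UNASSIGNED:
        raise MapTrap("reference to unassigned page")
    if write and register & READ_ONLY:
        raise MapTrap("store into read-only page")
    real_page = register & PAGE_MASK
    return (real_page << PAGE_BITS) | (logical & (PAGE_SIZE - 1))


user_map = [UNASSIGNED] * 8
user_map[5] = 0o11             # logical page 5 corresponds to real page 11 (octal)
print(oct(translate(0o24654, user_map)))
```

The top three bits of the logical address select the register and the low eleven bits pass through unchanged, so octal 24654 in logical page 5 becomes octal 44654 in real page 11 (octal).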

Another significant hardware modification is the mechanism for
going between modes. Once the machine is in user mode, it can
get to monitor mode under three circumstances:

1  If a hardware interrupt occurs.

2  If a trap is generated by the user program as outlined.

3  If an instruction with a particular configuration of two bits
is executed. Such an instruction is called a system programmed
operator (SYSPOP).

[Fig. 1 block diagram. Labels: CPU (SDS 930, modified); two memory modules, 16K words each, 1.75 µsec; drum I/O processor and drum; general I/O processor; mass store; TTY interface and teletypes; P.T. reader; graphic display; keyboard displays; PDP-5; graphic display and light pen (planned); remote computers.]

Fig. 1. Configuration of equipment.

In case 3, the six-bit operation field is used to select one of 64
locations in absolute core. The current address of the instruction
is put into absolute location zero as a subroutine link, the indirect
address bit of this link word is set, and another bit is set, marking
the memory location in the link word as having come from user-mapped
memory. The system routine thus invoked may take a
parameter from the word addressed by the SYSPOP, since its
address field is not interpreted by the hardware. The routine will
address the parameter indirectly through location zero and, because
of the bit marking the contents of location zero as having
come from user mode, the user map will be applied to the remainder
of the address indirection. All calls on the system which
are not inadvertent are made in this way.
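
The SYSPOP entry sequence can be modelled in a few lines of Python. Only the widths of the fields (a six-bit operation field, a 14-bit address field) and the contents of the link word are taken from the text. The bit positions, the base address of the 64 entry locations, and the function names are assumptions introduced for the sketch, and indexing and further levels of indirection are ignored.

```python
ADDRESS_MASK = (1 << 14) - 1   # 14-bit address field, from the text
OPCODE_SHIFT = 15              # assumed position of the six-bit operation field
INDIRECT_BIT = 1 << 14         # assumed position of the indirect address bit
USER_MAP_BIT = 1 << 23         # assumed position of the "user-mapped" marker bit
VECTOR_BASE = 0o100            # hypothetical base of the 64 entry locations


def enter_syspop(instruction, instruction_address, core):
    """Model the hardware action for a SYSPOP and return the entry location."""
    opcode = (instruction >> OPCODE_SHIFT) & 0o77
    # Location zero receives the link: address, indirect bit, user-map bit.
    core[0] = (instruction_address & ADDRESS_MASK) | INDIRECT_BIT | USER_MAP_BIT
    return VECTOR_BASE + opcode


def parameter_address(core, user_memory):
    """Follow the link in location zero and return the SYSPOP's address field."""
    link = core[0]
    assert link & INDIRECT_BIT and link & USER_MAP_BIT
    syspop_word = user_memory[link & ADDRESS_MASK]   # fetched through the user map
    return syspop_word & ADDRESS_MASK


core = {}
word = (0o25 << OPCODE_SHIFT) | 0o4321     # SYSPOP number 25 (octal), parameter at 4321
entry = enter_syspop(word, 0o1234, core)
print(oct(entry), oct(parameter_address(core, {0o1234: word})))
```

Because the link word carries both the indirect bit and the user-map marker, a single indirect reference through location zero reaches the user's parameter, and an indirect branch through the same location returns to the user program.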

A  monitor  mode  program  gets  into  user  mode  by  transferring 
to  an  address  with  mapping  specified.  This  means,  among  other 
things,  that  a  SYSPOP  can  return  to  the  user  program  simply  by 
branching  indirect  through  location  zero. 

As the above discussion has perhaps indicated, the mode-changing
arrangements are very clean and permit rapid and natural
transfers of control between user and system programs. Advantage
has been taken of this fact to create a rather grandiose
machine for the user. Its features are the subject of this paper.

Basic  features  of  the  machine 

[Fig. 2 diagram. Example shown in panel (b), with 32K real core: virtual effective address 24654₈; mapping register 5 contains 11₈; real effective address 44654₈; read-only bit off.]

Fig. 2. The hardware memory map. (a) Relation between virtual and real
memory for a typical map. (b) Construction of a real memory address.

A user in the Berkeley time-sharing system, working at what he
thinks of as the hardware language level, has at his disposal a
machine with a configuration and capability which can be conveniently
controlled by the execution of machine instruction sequences.
Its simplest configuration is very similar to that of a
standard medium-sized computer. In this configuration, the
machine possesses the standard 930 complement of arithmetic and
logic instructions and, in addition, a set of software interpreted
monitor and executive instructions. The latter instructions, which
will be discussed more fully in the following, do rather complex
input-output of many different kinds, perform many frequently
used table lookup and string processing functions, implement
floating point operations, and provide for the creation of more
complex machine configurations. Some examples of the instructions
available are:

1  Load A, B, or X (index) registers from memory or store any
of the registers. Indexing and indirect addressing are available
on these and almost all other instructions. Double word
load and store are also available.

2  The normal complement of fixed-point arithmetic and logic
operations.

3  Skips on various arithmetic and logic conditions.

4  Floating point arithmetic and input-output. The latter is
in free format or in the equivalent of Fortran E or F format.

5  Input a character from a teletype or write a block of arbitrary
length on a drum file.

6  Look up a string in a hash-coded table and obtain its position
in the table.

7  Create a new process and start it running concurrently with
the present one at a specified point.

8  Redefine the memory of the machine to include a portion
of that which is also being used by another program.

It should be emphasized that, although many of these instructions
are software interpreted, their format is identical to the
standard machine instruction format, with the exception of the
one bit which specifies a system interpreted instruction. Since the
system interpretation of these instructions is completely invisible
to the machine user, and since these instructions do have the
standard machine instruction format, the user and his program
make no distinction between hardware and software interpreted
instructions.

Some of the possible 192 operation codes are not legal in the
user machine. Included in this category are those hardware instructions
which would halt the machine or interfere with the
input-output if allowed to execute, and those software interpreted
instructions which attempt to do things which are forbidden to
the program. Attempted execution of one of these instructions will
result in an illegal instruction violation. The effect of an illegal
instruction violation is described later.

Memory  configuration 

The memory size and organization of the machine is specified by
an appropriate sequence of instructions. For example, the user may
specify a machine which has 6K of memory with addresses from
0 to 13777₈; alternatively, he may specify that the 6K should
include addresses 0 to 3777₈, 14000₈ to 17777₈, and 34000₈ to
37777₈. The user may also specify the size and configuration of
the machine's secondary storage and, to a considerable extent, the
structure of its input-output system. A full discussion of this capability
will be deferred to a later section.

The next few paragraphs discuss the mechanism by which the
user's program may specify its memory size and organization. This
mechanism, known as the process map to distinguish it from the
hardware memory address mapping, uses a (software) mapping
register consisting of eight 6-bit bytes, one byte for each of the
eight 2K blocks addressable by the 14 bit address field of an instruction.
Each of these bytes either is 0 or addresses one of the
63 words in a table called the private memory table (PMT). Each
user has his own private memory table. An entry in this table
provides information about a particular 2K block of memory. The
block may be either local to the user or it may be shared. If the
block is local, the entry gives information about whether it is
currently in core or on the drum. This information is important
to the system but need not concern the user. If the block is shared,
its PMT entry points to an entry in another table called the shared
memory table (SMT). Entries in this table describe blocks of
memory which are shared by several users. Such blocks may contain
invariant programs and constants, in which case they will be
marked as read-only, or they may contain arbitrary data which
is being processed by programs belonging to two different users.
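
The relation between the process map, the private memory table and the shared memory table can be sketched as a pair of Python data structures. Only the overall shape (eight map bytes, each 0 or an index into the PMT, with shared entries pointing into the SMT) is from the text; the field names and the function describe_page are hypothetical, and the actual table formats are not given.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PMTEntry:
    """One private memory table entry (fields are illustrative)."""
    shared_index: Optional[int] = None   # index into the SMT if the block is shared
    in_core: bool = False                # for a local block: in core, else on drum


@dataclass
class SMTEntry:
    """One shared memory table entry (fields are illustrative)."""
    read_only: bool = False


def describe_page(page, process_map, pmt, smt):
    """Describe one of the eight 2K pages by following its map byte to the PMT."""
    byte = process_map[page]             # 0, or an index 1..63 into the PMT
    if byte == 0:
        return "unassigned"
    entry = pmt[byte]
    if entry.shared_index is None:
        return "local, in core" if entry.in_core else "local, on drum"
    shared = smt[entry.shared_index]
    return "shared, read-only" if shared.read_only else "shared"


# Invented example: two local blocks, one shared read-only block, five unassigned.
pmt = {1: PMTEntry(in_core=True), 2: PMTEntry(), 4: PMTEntry(shared_index=1)}
smt = {1: SMTEntry(read_only=True)}
process_map = [1, 2, 4, 0, 0, 0, 0, 0]
for page in range(8):
    print(page, describe_page(page, process_map, pmt, smt))
```

Keeping the core-or-drum state in the table entry, not in the map byte, is what lets the system move a local block without the user's map changing.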

A possible arrangement of logical or virtual memory for a
process is shown in Fig. 3. The nature of each page has been noted
in the picture of the virtual memory; this information can also
be obtained by taking the corresponding byte of the map and
looking at the PMT entry specified by that byte. The figure shows
a large amount of shared memory, which suggests that the process
might be a compilation, sharing the code for the compiler with
other processes translating programs written in the same source
language. Virtual pages one and two might hold tables and temporary
storage which are unique to each separate compilation.
Note that, although the flexibility of the map allows any block
of code or data to appear anywhere in the virtual memory, it is
certainly not true that a program can run regardless of which pages
it is in. In particular, if it contains references to itself, such as
branch instructions, then it must run in the same virtual pages
into which it was loaded.

[Fig. 3 diagram: 16K virtual memory (pages 0 to 7), process map, and private memory table. Page labels that survive: SHARED BL6, SHARED BL2, UNASSIGNED, SHARED BL3, UNASSIGNED. Private memory table (entry: block): 1: M3; 2: M4; 3: M5; 4: SMT1; 5: SMT4; 6: SMT2; 7: M12; 8: SMT6; 9: SMT3; 10: 0.]

Fig. 3. Layout of virtual memory for a typical process.

Two  instructions  are  provided  which  permit  the  user  to  read 
and  modify  his  process  map.  The  ability  to  read  the  process 
mapping  registers  permits  the  user  to  obtain  the  current  memory 
assignment,  and  the  ability  to  write  the  registers  permits  him  to 
reassign  memory  in  any  way  which  suits  his  fancy.  The  system 
naturally  checks  each  new  map  as  it  is  established  to  ensure  that 
the  process  is  not  attempting  to  obtain  unauthorized  access  to 
memory  which  does  not  belong  to  it. 

When  the  user's  process  is  initiated,  it  is  assigned  only  enough 
memory to contain the program data as initially loaded. For instance,
if the program and constants occupy 3000₈ words, two
blocks, say blocks 0 and 1, will be assigned. At this point, the first
two bytes of the process mapping register will be nonzero; the
others will be zero. When the program runs, it may address memory
outside  of  the  first  4K.  If  it  does,  and  if  the  user  has  specified  a 
machine  size  larger  than  4K,  a  new  block  of  memory  will  be 
assigned  to  him  which  makes  the  formerly  illegal  reference  legal. 
In  this  way,  the  user's  process  may  obtain  more  memory.  In  fact, 
it  may  easily  obtain  more  than  16K  of  memory  simply  by  ad- 
dressing 16K,  reading  and  preserving  the  process  mapping  register, 
setting  it  with  some  of  the  bytes  cleared  to  zero,  and  grabbing 
some  more  memory.  Of  course,  only  16K  can  be  addressed  at  one 
time;  this  is  a  limitation  imposed  by  the  address  field  of  the 
machine. 


There  is  an  instruction  which  allows  a  process  to  specify  the 
maximum  amount  of  memory  which  it  is  allowed  to  have.  If  it 
attempts to obtain more than this amount, a memory violation will
occur.  A  memory  violation  can  also  be  caused  by  attempts  to 
transfer  into  or  indirect  through  unassigned  memory,  or  to  store 
into  read-only  memory.  The  effect  of  this  violation  is  similar  to 
the  effect  of  an  illegal  instruction  violation  and  will  be  discussed. 
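
The demand assignment of memory, together with the limit a process may set on itself, might be modelled as follows. This is a sketch under stated assumptions: blocks are taken to be 2048 words, the class `UserProcess` and its methods are hypothetical, and the numbering of PMT entries is invented for the illustration.

```python
PAGE_WORDS = 2048   # assumed block size: 16K of virtual memory / 8 map bytes
MAP_BYTES = 8


class MemoryViolation(Exception):
    """Stands in for the memory violation described in the text."""


class UserProcess:
    def __init__(self, initial_blocks, max_blocks):
        self.max_blocks = max_blocks          # limit specified by the process
        self.blocks_owned = 0
        self.process_map = [0] * MAP_BYTES    # zero byte = unassigned page
        for page in range(initial_blocks):
            self._assign(page)

    def _assign(self, page):
        if self.blocks_owned >= self.max_blocks:
            raise MemoryViolation("memory limit exceeded")
        self.blocks_owned += 1
        # Hypothetical numbering: the n-th block owned uses PMT entry n.
        self.process_map[page] = self.blocks_owned

    def reference(self, address):
        """Touch a virtual address, assigning a new block on first use."""
        page = address // PAGE_WORDS
        if not 0 <= page < MAP_BYTES:
            raise MemoryViolation("address outside the 16K address field")
        if self.process_map[page] == 0:
            self._assign(page)
        return self.process_map[page]
```

A process created with two initial blocks and a limit of three can touch one further page; a reference to a fourth page then raises the violation.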

The  facilities  just  described  are  entirely  sufficient  for  programs 
which  need  to  reorganize  the  machine's  memory  solely  for  internal 
purposes.  In  many  cases,  however,  the  program  wishes  to  obtain 
access  to  memory  blocks  which  have  been  created  by  the  system 
or  by  other  programs.  For  example,  there  may  be  a  package  of 
mathematical  and  utility  routines  in  the  system  which  the  program 
would  like  to  use.  To  accommodate  this  requirement,  there  is  an 
instruction which establishes a relationship between a name and
a  certain  process  mapping  function.  This  instruction  moves  the 
PMT  entries  for  the  blocks  addressed  by  the  specified  process 
mapping  function  into  the  shared  memory  table  so  that  they  are 
generally  accessible  to  all  users.  Once  this  correspondence  has 
been  established,  there  is  another  instruction  which  allows  a 
different  user  to  deliver  the  name  and  obtain  in  return  the  associ- 
ated process  map.  This  instruction  will,  if  necessary,  make  new 
entries  in  the  second  user's  PMT.  Various  subsystems  and  programs 
of  general  interest  have  names  permanently  assigned  to  them  by 
the  system. 
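
One way to picture the two naming instructions is the sketch below. It is a guess at the bookkeeping, not the system's actual tables: `SharingSystem`, `publish`, and `attach` are hypothetical names, and PMT entries are represented as strings such as "M3" or "SMT1" in the style of Fig. 4.

```python
class SharingSystem:
    def __init__(self):
        self.smt = {}     # shared memory table: SMT index -> drum block
        self.names = {}   # published name -> SMT index per page (0 = no page)

    def publish(self, name, pmt, process_map):
        """Move the PMT entries addressed by process_map into the SMT."""
        pages = []
        for byte in process_map:
            if byte == 0:
                pages.append(0)
                continue
            entry = pmt[byte]
            if not entry.startswith("SMT"):
                index = len(self.smt) + 1
                self.smt[index] = entry
                entry = pmt[byte] = f"SMT{index}"  # owner now points at the SMT
            pages.append(int(entry[3:]))
        self.names[name] = pages

    def attach(self, name, pmt):
        """Return a process map for `name`, adding PMT entries if necessary."""
        process_map = []
        for index in self.names[name]:
            if index == 0:
                process_map.append(0)
                continue
            entry = f"SMT{index}"
            for byte, existing in pmt.items():
                if existing == entry:
                    break                      # second user already has it
            else:
                byte = max(pmt, default=0) + 1
                pmt[byte] = entry
            process_map.append(byte)
        return process_map
```

The second user receives map bytes that index his own PMT, so the same shared block may sit behind different byte values for different users.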

The  user  machine  thus  makes  it  possible  for  a  number  of  proc- 
esses belonging  to  independent  users  to  run  with  memory  which 
is  an  arbitrary  combination  of  blocks  local  to  each  individual 
process,  blocks  shared  between  several  processes,  and  blocks  per- 
manently available  in  the  system.  A  complex  configuration  is 
sketched in Fig. 4. Process 1.1 was shown in more detail in
Fig. 3. Each box represents a process, and the numbers within represent
the eight map bytes. The arrows between processes show the
process  hierarchy,  which  is  discussed  in  the  next  section.  Note  that 
the  PMT's  belong  to  the  users,  not  to  the  processes. 

From  the  above  discussion,  it  is  apparent  that  the  user  can 
manipulate  the  machine  memory  configuration  to  perform  simple 
memory  overlays,  to  change  data  bases,  or  to  perform  other  more 
complex  tasks  requiring  memory  reconfiguration.  For  example,  the 
use  of  common  routines  is  greatly  facilitated,  since  it  is  necessary 
only  to  adjust  the  process  map  so  that  (1)  memory  references 
internal  and  external  to  the  common  routine  are  correct,  and  (2) 
the  memory  area  in  which  the  routine  resides  is  read-only.  In  the 
simplest  case,  in  which  the  common  routine  and  the  data  base 
fit  into  16K  of  memory,  the  map  is  initially  established  and  remains 
static  throughout  the  execution  of  the  routine.  In  other  cases  where 


Chapter  24  |  A  user  machine  in  a  time-sharing  system  295 


the  routine  and  data  base  do  not  fit  into  16K,  or  where  several 
common  routines  are  concurrently  employed,  it  may  be  necessary 
to  make  frequent  adjustment  to  the  map  during  execution. 

Multiple  processes 

An  important  feature  of  the  user  machine  allows  the  user  program, 
which  in  the  current  context  will  be  referred  to  as  the  controlling 
process,  to  establish  one  or  more  subsidiary  processes.  With  a  few 
minor exceptions, to be discussed, each subsidiary process has the
same status as the controlling process. Thus, it may in turn establish
a subsidiary process. It is therefore apparent that the user
machine is in fact a multi-processing machine. The original suggestion
which gave rise to this capability was made by Conway
[Conway, 1963]; more recently the Multics system has included
a multi-process capability [Corbato and Vyssotsky, 1965; Dennis
and Van Horn, 1966; Saltzer, 1966].

A  process  is  the  logical  environment  for  the  execution  of  a 
program,  as  contrasted  to  the  physical  environment,  which  is  a 
hardware processor. It is defined by the information which is required
for the program to run; this information is called the state
vector. To create a new process, a given process executes an instruction
which has arguments specifying the state vector of the
new process. This state vector includes the program counter, the
central registers, and the process map. The new process may have
a memory configuration which is the same as, or completely different
from, that of the originating process. The only constraint
placed on this memory specification is that the total memory
available to the multi-process system is limited to 128K by the
process  mapping  mechanism,  which  is  common  to  all  processes. 
Each  user,  of  course,  has  his  own  128K. 

This  facility  was  put  into  the  system  so  that  the  system  could 
control  the  user  processes.  It  is  also  of  direct  value,  however,  for 
many  user  processes.  The  most  obvious  examples  are  input-output 
buffering  routines,  which  can  operate  independently  of  the  user's 
main  program,  communicating  with  it  through  memory  and  with 
interrupts (see the following). Whether the operation being buffered
is large volume output to a disc or teletype requests for
information about the progress of a running program, the degree
of  flexibility  afforded  by  multiple  processes  far  exceeds  anything 
which  could  have  been  built  into  the  input-output  system.  Fur- 
thermore, the  overhead  is  very  low;  an  additional  process  requires 
about  15  words  of  core,  and  process  switching  takes  about  1  ms 
under  favorable  conditions.  There  are  numerous  other  examples 
of  the  value  of  multiple  processes;  most,  unfortunately,  are  too 
complex  to  be  briefly  explained. 

A process may create a number of subsidiary processes, each
of which is independent of the others and equivalent to them from
the point of view of the originating process. Figure 4 shows two
simple multi-process structures, one for each of two users. Note
that each process has associated with it pointers to its controlling
process and to one of its subsidiary processes. When a process has
two immediate descendants, as in the case of processes 1.2 and
1.3, they are chained together on a ring. Thus, three pointers, up,
down,  and  ring,  suffice  to  define  the  process  structure  completely. 
The  up  pointers  are,  of  course,  redundant,  but  are  convenient  for 
the  implementation.  The  process  is  identified  by  a  process  number 
which  is  returned  by  the  system  when  it  is  created. 
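
The three-pointer structure can be illustrated with a small sketch. It assumes the ring is circular and that a new process is spliced in after the one the down pointer designates; both are assumptions, and the names `Process`, `create`, and `subsidiaries` are hypothetical.

```python
class Process:
    def __init__(self, number):
        self.number = number
        self.up = None      # controlling process
        self.down = None    # one of the subsidiary processes
        self.ring = self    # next process on the same ring (itself if alone)


def create(parent, number):
    """Create a subsidiary of `parent` and link it into the structure."""
    child = Process(number)
    child.up = parent
    if parent.down is None:
        parent.down = child
    else:
        first = parent.down          # splice the newcomer in after `first`
        child.ring = first.ring
        first.ring = child
    return child


def subsidiaries(parent):
    """List the immediate descendants by following down, then the ring."""
    found = []
    node = first = parent.down
    while node is not None:
        found.append(node)
        node = node.ring
        if node is first:            # back at the start of the ring
            break
    return found
```

For user 1 of Fig. 4, creating 1.2 and then 1.3 under 1.1 yields a ring of two processes reachable from the single down pointer of 1.1.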

A complex structure such as that in Fig. 5 may result from the
creation of a number of subsidiary processes. The processes in
Fig. 5 have been numbered arbitrarily to allow a clear description
of the way in which the pointers are arranged. Note that the user
need not be aware of these pointers; they are shown here to clarify
the manner in which the multiple process mechanism is implemented.

A process may destroy one of its subsidiary processes by executing
the appropriate instruction. For obvious reasons this operation
is not legal if the process being destroyed itself has subsidiary


[Figure 4: process hierarchy for two users. Each box holds the eight map bytes of one process.]

User 1 process maps
1.1                   4 1 2 8 6 0 9 0
1.2                   4 1 2 0 0 3 5 0
1.3                   4 0 0 6 6 7 1 2

User 2 process maps
2.1                   10 3 0 0 0 0 8 9
(subsidiary process)  1 3 4 0 6 5 8 9
(subsidiary process)  1 3 4 0 0 5 8 0

Entry   PMT 1   PMT 2   SMT
1       M3      SMT1    M1
2       M4      SMT5    M16
3       M5      M7      M2
4       SMT1    M8      M10
5       SMT4    M9      M11
6       SMT2    SMT2    M6
7       M12     M13
8       SMT6    SMT3
9       SMT3    M14
10      0       M15

Fig. 4. Process and memory configuration for two users. (The processes
are numbered for each user and are represented by their process mapping
registers. Memory blocks are identified by drum addresses, which
are written M1, M2, ....)




Fig.  5.  Hierarchy  of  processes. 


processes.  It  is  possible  to  find  out  what  processes  are  subsidiary 
to  any  given  one;  this  permits  a  process  to  destroy  an  entire  tree 
of  sub-processes  by  reading  the  tree  from  the  top  down  and  de- 
stroying it  from  the  bottom  up. 
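
Reading a tree from the top down and destroying it from the bottom up amounts to a traversal followed by destruction in reverse order of discovery. The sketch below uses a toy table; `ProcessTable` and `destroy_tree` are hypothetical names, and the rule that a process with subsidiaries cannot be destroyed is modelled with an exception.

```python
class ProcessTable:
    """Toy process table; the names and methods are hypothetical."""

    def __init__(self):
        self.children = {}          # process number -> subsidiary numbers

    def create(self, parent, number):
        self.children.setdefault(parent, []).append(number)
        self.children[number] = []
        return number

    def subsidiaries(self, number):
        return list(self.children[number])

    def destroy(self, number):
        if self.children[number]:
            raise ValueError(f"process {number} still has subsidiaries")
        del self.children[number]
        for kids in self.children.values():
            if number in kids:
                kids.remove(number)


def destroy_tree(table, root):
    """Destroy `root` and everything below it; return the order used."""
    found, stack = [], [root]
    while stack:                            # read the tree from the top down
        number = stack.pop()
        found.append(number)
        stack.extend(table.subsidiaries(number))
    order = list(reversed(found))           # descendants precede ancestors
    for number in order:                    # destroy from the bottom up
        table.destroy(number)
    return order
```

Because every process is discovered after its controlling process, reversing the discovery order guarantees that no process is destroyed while it still has subsidiaries.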

The  operations  of  creating  and  destroying  processes  are  entirely 
separate  from  those  of  starting  and  stopping  their  execution,  for 
which  two  more  operations  are  provided.  A  process  whose  execu- 
tion has  been  stopped  is  said  to  be  suspended. 

To  assure  that  these  various  processes  can  effectively  work 
together  on  a  common  task,  several  means  of  interprocess  com- 
munication exist.  The  first  allows  the  controlling  process  to  obtain 
the  current  status  of  each  of  its  subsidiary  processes.  This  status 
information,  which  is  read  into  a  table  by  the  execution  of  the 
appropriate system instruction, includes the current state vector
and  operating  status.  The  operating  status  of  any  process  may  be 

1  Running 

2  Dismissed  for  input-output 

3  Terminated  for  memory  violation 

4  Terminated for illegal instruction violation, or

5  Terminated  by  the  process  itself 


A  second  instruction  allows  the  controlling  process  to  become 
dormant until one of its subsidiary processes terminates. Termination
can occur in the following three ways:

1  Because  of  a  memory  violation 

2  Because  of  an  illegal  instruction  violation 

3  Because  of  self-termination 

Interactions  described  previously  provide  no  method  by  which 
a  process  can  attract  the  attention  of  another  process  which  is 
pursuing  an  independent  course.  This  can  be  done  with  a  program 
interrupt.  Associated  with  each  process  is  a  20-bit  interrupt  mask. 
If  a  mask  bit  is  set,  the  process  may,  under  certain  conditions  (to 
be  described  in  the  following),  be  interrupted;  i.e.,  a  transfer  to 
a  fixed  address  will  be  simulated.  The  program  will  presumably 
have  at  this  fixed  address  the  location  of  a  subroutine  capable  of 
dealing  with  the  interrupt  and  returning  to  the  interrupted  com- 
putation afterwards.  The  mechanism  is  functionally  almost  identi- 
cal to  many  hardware  interrupt  systems. 

A  process  may  cause  an  interrupt  by  delivering  the  number 
of  the  interrupt  to  the  appropriate  instruction.  The  process  causing 
the  interrupt  continues  undisturbed,  but  the  nearest  process  which 
is  either  on  the  same  level  as  the  one  causing  the  interrupt  or 
above  it  in  the  hierarchy  of  processes,  and  which  has  the  appro- 
priate interrupt  armed,  will  be  interrupted.  This  mechanism  pro- 
vides a  very  flexible  way  for  processes  to  interact  with  each  other 
without  wasting  any  time  in  the  testing  of  flags  or  similar  frivolous 
activities. 
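
The routing rule admits more than one reading. The sketch below takes "nearest" to mean: look first at the other processes on the same level as the one causing the interrupt, then move up one level at a time. That ordering, the numbering of interrupts from 1 to 20, and the names `Proc` and `deliver` are all assumptions.

```python
class Proc:
    def __init__(self, name, parent=None, armed=0):
        self.name = name
        self.parent = parent
        self.children = []
        self.armed = armed & 0xFFFFF      # 20-bit interrupt mask
        if parent is not None:
            parent.children.append(self)


def deliver(source, number):
    """Return the process interrupted by `number` (1 to 20), or None."""
    bit = 1 << (number - 1)
    level = source
    while level is not None:
        # Processes on this level: the ring `level` belongs to.
        peers = level.parent.children if level.parent is not None else [level]
        for candidate in peers:
            if candidate is not source and candidate.armed & bit:
                return candidate          # source itself is never disturbed
        level = level.parent              # nothing armed here: move up
    return None
```

In the test tree used below, an interrupt raised by a third-level process skips its unarmed controlling process and lands on the armed process beside it.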

Interrupts  may  be  caused  not  only  by  the  explicit  action  of 
processes,  but  also  by  the  occurrence  of  several  special  conditions. 
The  occurrence  of  a  memory  violation,  attempted  execution  of 
an  illegal  instruction,  an  unusual  input-output  condition,  the  ter- 
mination of  a  subsidiary  process,  or  the  intervention  of  a  user  at 
a  console  (by  pushing  a  reserved  button)  all  may  cause  unique 
interrupts  (if  they  have  been  previously  armed).  In  this  way,  a 
process  may  be  notified  conveniently  of  any  unusual  conditions 
associated  with  other  processes,  the  process  itself,  or  a  console  user. 

The memory assignment algorithm discussed previously is
slightly modified in the presence of multiple processes. When a
process is activated, one of three options may be specified:

1  Assign  new  memory  to  the  process  entirely  independently 
of  the  controlling  process. 

2  Assign  no  new  memory  to  the  process.  Any  attempt  to 
obtain  new  memory  will  cause  a  memory  violation. 


3  If  the  process  attempts  to  obtain  new  memory,  scan  upward 
through  the  process  hierarchy  until  the  topmost  process  is 
reached.  If  at  any  time  during  this  scan  a  process  is  found 
for  which  the  address  causing  the  trap  is  legal,  propagate 
the  memory  assigned  to  it  down  through  the  hierarchy  to 
the  process  causing  the  trap. 

Option 3 permits a process to be started with a subset of
memory  and  later  to  reacquire  some  of  the  memory  which  was 
not  given  to  it  initially.  This  feature  is  important  because  the 
amount  of  memory  assigned  to  a  process  influences  the  operating 
efficiency  of  the  system  and  thus  the  speed  with  which  it  will  be 
able  to  respond  to  teletypes  or  other  real-time  devices. 
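
The upward scan of option 3 might look like the sketch below. It assumes 2048-word pages and relies on the earlier remark that the PMT belongs to the user, so a map byte copied from an ancestor is meaningful in every process of that user. The text does not say what happens when the scan reaches the top without success; the sketch simply returns None. The names `Proc` and `acquire_from_ancestors` are hypothetical.

```python
PAGE_WORDS = 2048   # assumed page size


class Proc:
    def __init__(self, name, parent=None, process_map=None):
        self.name = name
        self.parent = parent
        self.process_map = list(process_map or [0] * 8)


def acquire_from_ancestors(process, address):
    """Scan upward for a process in which `address` is legal.

    On success, copy that map byte down to every process between the
    ancestor and the one that trapped, and return the byte.
    """
    page = address // PAGE_WORDS
    chain = [process]                     # processes still lacking the page
    ancestor = process.parent
    while ancestor is not None:
        byte = ancestor.process_map[page]
        if byte != 0:
            for below in chain:           # propagate down the hierarchy
                below.process_map[page] = byte
            return byte
        chain.append(ancestor)
        ancestor = ancestor.parent
    return None                           # topmost process reached
```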

The  input-output  system 

The  user  machine  has  a  straightforward  but  unconventional  set 
of  input-output  instructions.  The  primary  emphasis  in  the  design 
of  these  instructions  has  been  to  make  all  input-output  devices 
interface  identically  with  a  program  and  to  provide  as  much 
flexibility  in  this  common  interface  as  possible.  Two  advantages 
result from this uniformity: it becomes natural to write programs
which are essentially independent of the environment in which
they operate, and the implementation of the system is greatly
simplified. To the user the former point is, of course, the important
one. 

It has been common, for example, for programs written to be
controlled  from  a  teletype  to  be  driven  instead  from  a  file  on,  let 
us  say,  the  drum.  A  command  exists  which  permits  the  recognizer 
for  the  system  command  language  and  all  of  the  subsystems  to 
be  driven  in  this  way.  This  device  is  particularly  useful  for  repeti- 
tive sequences  of  program  assemblies  and  for  background  jobs 
which  are  run  in  the  absence  of  the  user.  Output  which  normally 
goes  to  the  teletype  is  similarly  diverted  to  user  files.  Another 
application  of  the  uniformity  of  the  file  system  is  demonstrated 
in  some  of  the  subsystems,  notably  the  assembler  and  the  various 
compilers.  The  subsystem  may  request  the  user  to  specify  where 
he  wishes  the  program  listing  to  be  placed.  The  user  may  choose 
anything  from  paper  tape  to  drum  to  his  own  teletype.  In  the 
absence  of  file  uniformity  each  subsystem  would  require  a  separate 
block  of  code  for  each  possibility.  In  fact,  however,  the  same 
input-output instructions are used for all cases.

The  input-output  instructions  communicate  with  files.  The 
system  in  turn  associates  files  with  the  various  physical  devices. 
Programs,  for  the  most  part,  do  not  have  to  account  for  the  pecu- 
liarities of  the  various  actual  devices.  Since  devices  differ  widely 


in  characteristics  and  behavior,  the  flexibility  of  the  operations 
available  on  files  is  clearly  critical.  They  must  range  from  single- 
character  input  to  the  output  of  thousands  of  words. 

A file is opened by giving its name as an argument to the
appropriate instruction. Programs thus refer to all files symbolically,
leaving the details of physical location and organization to
the system. If authorized, a program may refer to files belonging
to other users by supplying the name of the other user as well
as the file name. The owner of a file determines who is authorized
to access it. The reader may compare this file naming mechanism
with a more sophisticated one [Daley and Neumann, 1965], bearing
in mind the fact that file names can be of any length and can be
manipulated (as strings of characters) by the program.

Access  to  files  is,  in  general,  either  sequential  or  random  in 
nature.  Some  devices  (like  a  keyboard-display  or  a  card  reader) 
are  purely  sequential,  while  others  (like  a  disk)  may  be  either 
sequentially or randomly accessed. There are accordingly two
major I/O interfaces to deal with these different qualities. The
interface used in conjunction with a given file depends on whether
the file was declared to be a random or a sequential file. The two
major interfaces are each broken down into other interfaces, primarily
for reasons of implementation. Although the distinction
between  sequential  and  random  files  is  great,  the  subinterfaces  are 
not  especially  visible  to  the  user. 

Sequential  files 

The three instructions CIO (character input-output), WIO (word
input-output), and BIO (block input-output) are used to communicate
with a sequential file. Each instruction takes as an operand
a  file  number.  This  number  is  given  to  the  program  when  it  opens 
a  file.  At  the  time  of  opening  a  file  it  must  be  specified  whether 
the  file  is  to  be  read  from  or  written  onto.  Whether  any  given 
device  associated  with  the  file  is  character-oriented  or  word- 
oriented  is  unimportant;  the  system  takes  care  of  all  necessary 
character-to-word  assembly  or  word-to-character  disassembly. 
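
The assembly of characters into words can be illustrated as follows. The word and character sizes are not stated in this section; the sketch assumes a 24-bit word holding three 8-bit characters, left-justified and zero-filled, and the function names are hypothetical.

```python
CHARS_PER_WORD = 3   # assumption: three 8-bit characters per 24-bit word
CHAR_BITS = 8


def pack(chars):
    """Assemble a byte string into words, zero-filling the last word."""
    words = []
    for start in range(0, len(chars), CHARS_PER_WORD):
        group = chars[start:start + CHARS_PER_WORD].ljust(CHARS_PER_WORD, b"\0")
        word = 0
        for code in group:
            word = (word << CHAR_BITS) | code
        words.append(word)
    return words


def unpack(words, count):
    """Disassemble words into the first `count` characters."""
    out = bytearray()
    for word in words:
        for shift in (16, 8, 0):
            out.append((word >> shift) & 0xFF)
    return bytes(out[:count])
```

A program writing words to a character-oriented device, or reading characters from a word-oriented one, would see only one side of this conversion.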

There are actually three separate, full-duplex physical interfaces
to devices in the sequential file mechanism. Generally, these
interfaces are invisible to programs. They exist, of course, for
reasons of system efficiency and also because of the way in which
some devices are used. The interfaces are:

1  Character-by-character (basically for low-speed, character-oriented
devices used for man-machine interaction)

2  Buffered block I/O (for medium-speed I/O applications)

3  Block I/O directly from user core (for high-speed situations)




It  should  be  pointed  out  that  there  is  no  particular  relation  be- 
tween these  interfaces  and  the  three  instructions  CIO,  WIO,  and 
BIO.  The  interface  used  in  a  given  situation  is  a  function  of  the 
device  involved  and,  sometimes,  of  the  volume  of  data  to  be  trans- 
mitted, not  of  the  instruction. 

Any  interface  may  be  driven  by  any  instruction. 

Of  the  three  subinterfaces  under  discussion,  the  last  two  are 
straightforward.  The  character-by-character  interface  is,  however, 
somewhat  different  and  deserves  some  elaboration.  Devices  associ- 
ated with  this  interface  are  generally  (but  not  necessarily)  used 
for man-machine interaction. Consider the case of a person communicating
with a program by means of a keyboard-display (or
a  teletype).  He  types  on  the  keyboard  and  the  information  is 
transmitted  to  the  computer.  The  program  may  wish  to  make  an 
immediate  response  on  the  display  screen.  In  many  cases  this 
response  will  consist  of  an  echo  of  the  same  character,  so  that  the 
user  has  the  feeling  of  typing  directly  onto  the  screen  (or  onto 
the  teleprinter). 

So  that  input-output  can  be  carried  out  when  the  program  is 
not  actually  in  main  memory,  the  character-by-character  input 
interface  permits  programs  a  choice  of  a  number  of  echo  tables; 
it  further  permits  programs  a  choice  of  grade  of  service  by  per- 
mitting them  to  specify  whether  a  given  character  is  an  attention 
(or  break)  character.  Thus,  for  example,  the  program  may  specify 
that  each  character  typed  is  to  be  echoed  immediately  and  that 
all  control  characters  are  to  result  in  activation  of  the  program 
regardless  of  the  number  of  characters  in  the  input  buffer.  Alter- 
natively, the  program  may  specify  that  no  characters  are  echoed 
and  every  character  is  a  break  character.  By  changing  the  specifi- 
cation the  program  can  obtain  an  appropriate  (and  varying)  grade 
of  service  without  putting  undue  load  on  the  system.  Figure  6 


[Figure 6: block diagram showing the output buffer, the output interrupt routine, and the input buffer.]

Fig. 6. The character-oriented interface.


shows  the  components  of  the  character-by-character  interface; 
responsibility  for  its  operation  is  split  between  the  interrupt  called 
when  the  device  signals  for  attention  and  the  routine  which  proc- 
esses the  user's  I/O  request. 
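
The division of labour between echo tables and break characters can be sketched as below. The class `CharacterInput` and its fields are hypothetical; a break set of None is used here to mean that every character is a break character.

```python
class CharacterInput:
    def __init__(self, echo_table, break_chars):
        self.echo_table = echo_table   # character -> echo; absent = no echo
        self.break_chars = None if break_chars is None else set(break_chars)
        self.buffer = []               # input awaiting the program
        self.echoed = []               # what was sent back to the device

    def key_typed(self, ch):
        """Handle one character at interrupt time.

        Returns True when the program should be activated.
        """
        self.buffer.append(ch)
        echo = self.echo_table.get(ch)
        if echo:
            self.echoed.append(echo)
        return self.break_chars is None or ch in self.break_chars
```

With an echo table that returns each character and a carriage return as the only break character, a whole line accumulates before the program is activated; with an empty echo table and a break set of None, the program is activated on every keystroke and does its own echoing.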

The  advantage  of  the  full-duplex,  character-by-character  mode 
of  operation  is  considerable.  The  character-by-character  capability 
means  that  the  user  can  interact  with  his  program  in  the  smallest 
possible  unit — the  character.  Furthermore,  the  full-duplex  capa- 
bility permits,  among  other  things  (1)  the  program  to  substitute 
characters or strings of characters as echoes for those received,
(2)  the  keyboard  and  display  to  be  used  simultaneously  (as,  for 
example,  permitting  a  character  typed  on  a  keyboard  to  pre-empt 
the  operation  of  a  process.  In  the  case  of  typing  information  in 
during  the  output  of  information,  a  simple  algorithm  prevents  the 
random  admixture  of  characters  which  might  otherwise  result), 
and  (3)  the  ready  detection  of  transmission  errors. 

Instructions  are  included  to  enable  the  state  of  both  input  and 
output  buffers  to  be  sensed  and  perhaps  cleared  (discarding  un- 
wanted output  or  input).  Of  course,  it  is  possible  for  a  program 
to  use  any  number  of  authorized  physical  devices;  in  particular, 
this  includes  those  devices  used  as  remote  consoles.  A  mechanism 
is  provided  to  permit  output  which  is  directed  to  a  given  device 
to  be  copied  on  all  other  devices  which  are  output  linked  to  it 
(and  similarly  for  input).  This  is  useful  when  communication 
among  users  is  desired  and  in  numerous  other  situations. 

The sequential file has a structure somewhat similar to that
of  an  ordinary  magtape  file.  It  consists  of  a  sequence  of  logical 
records  of  arbitrary  length  and  number.  On  some  devices,  such 
as  a  card  reader  or  the  teletype,  a  file  may  have  only  one  logical 
record.  The  full  generality  is  available  for  drum  files,  which  are 
the  ones  most  commonly  used.  The  logical  record  is  to  be  con- 
trasted with  the  variable  length  physical  record  of  magtape  or  the 
fixed  length  record  of  a  card.  Instructions  are  provided  to  insert 
or  delete  logical  records  and  increase  or  decrease  them  in  length. 
Other  instructions  permit  the  file  to  be  "positioned"  almost  in- 
stantaneously to  a  specified  logical  record.  This  gives  the  sequen- 
tial file  greater  flexibility  than  one  which  is  completely  unaddressa- 
ble.  This  flexibility  is  only  possible,  of  course,  because  the  file  is 
on  a  random-access  device  and  the  sequential  structure  is  main- 
tained by  pointers.  The  implementation  is  discussed  in  the  follow- 
ing. 

When  reading  a  sequential  file,  CIO  and  WIO  return  certain 
unusual  data  configurations  when  they  encounter  an  end  of  record 
or  end  of  file,  and  BIO  terminates  transmission  on  either  of  the 
conditions  and  returns  the  address  of  the  last  word  transmitted. 
In  addition,  certain  flag  bits  are  set  by  the  unusual  conditions, 
and  an  interrupt  may  be  caused  if  it  has  been  armed. 




The implementation of the sequential file scheme for auxiliary
storage is illustrated in Fig. 7. Information is written on the drum
in 256-word physical records. The locations of these records are
kept track of in 64-word index blocks containing pointers to the
data blocks. For the file shown, the first logical record is more
than 256 words long but ends in the second 256-word block. The
second logical record fits in the third 256-word block and the third
logical  record — in  the  4th  data  block — is  followed  by  an  end  of 
file.  If  a  file  requires  more  than  64  index  words,  additional  index 
blocks  are  chained  together,  both  forward  and  backward.  Thus, 
in  order  to  access  information  in  the  file  it  is  necessary  only  to 
know  the  location  of  the  first  index  block.  It  may  be  worthwhile 
to  point  out  that  all  users  share  the  same  drum.  Since  the  system 
has  complete  control  over  the  allocation  of  space  on  the  drum, 
there  is  no  possibility  of  undesired  interaction  among  users. 

Available space for new data blocks or index blocks is kept track
of by a bit table, illustrated in Fig. 8. In the figure, each column
represents one of the 72 physical bands on the drum allocated for
the storage of file information. Each row represents one of the
64 256-word sectors around a band. Each bit in the table thus
represents one of the 4608 data blocks available. The bits are set
when a block is in use and cleared when the block becomes available.
Thus, if a new data block is required, the system has only
to read the physical position of the drum, use this position to index
in the table, and search a row for the appearance of a 0. The
column in which a 0 is found indicates the physical track on which
a block is available. Because of the way the row was chosen, this
block is immediately accessible. This scheme has two advantages
over its alternative, which is to chain unused blocks together:

1    It  is  easy  to  find  a  block  in  an  optimum  position,  using  the 
algorithm  just  described. 


[Figure 7: an index block with pointers to data blocks; end-of-record (EOR) and end-of-file (EOF) marks are shown in the data blocks.]

Fig. 7. Index blocks and pointers to data blocks.

[Figure 8: a bit table 72 bits wide and 64 words deep.]

Fig. 8. Bit table for allocation of space on the drum.


2    No  drum  operations  are  required  when  a  new  block  is 
needed  or  an  old  one  is  to  be  released. 

It  may  be  preferable  to  assign  the  new  block  so  that  it  becomes 
accessible immediately after the block last assigned for the file.
This  scheme  will  speed  up  subsequent  reading  of  the  file. 
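
The bit-table search is short enough to sketch directly. Each row is held here as one Python integer of 72 bits; the class name, the choice of the lowest-numbered free band, and the treatment of a full row (return None) are assumptions, since the text does not cover them.

```python
BANDS = 72      # columns: physical bands reserved for file storage
SECTORS = 64    # rows: 256-word sectors around a band


class DrumBitTable:
    def __init__(self):
        self.rows = [0] * SECTORS      # bit set = block in use

    def allocate(self, drum_position):
        """Claim a free block in the row selected by the drum position."""
        row = drum_position % SECTORS
        for band in range(BANDS):
            if not self.rows[row] & (1 << band):
                self.rows[row] |= 1 << band
                return band, row
        return None                    # no free block in this row

    def release(self, band, sector):
        """Clear the bit for a block that has become available."""
        self.rows[sector] &= ~(1 << band)
```

Both operations touch only the table in core, which is the second advantage listed above; the table as a whole describes 72 x 64 = 4608 blocks.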

Random  files 

Auxiliary storage files can also be treated as extensions of core
memory rather than as sequential devices. Such files are called
random files. A random file differs from a sequential file in that
there is no logical record structure to the file and that information
is extracted from or written into the random file by addressing
a specific word or block of words. It may be opened like a sequential
file; the only difference is that it need not be specified as an
output or an input file.

Four  instructions  are  used  to  input  and  output  words  and  blocks 
of  words  on  a  random  file.  To  permit  the  random  file  to  look  even 
more  like  core  memory,  an  instruction  enables  one  of  the  currently 
open  random  files  to  be  specified  as  the  secondary  memory  file. 
Two instructions, LAS (load A from secondary memory) and SAS
(store A in secondary memory), act like ordinary load and store
instructions with one level of indirect addressing (see Fig. 9) except,
of course, that the data are in a random file instead of in
core memory.
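
The effect of LAS and SAS can be imitated with two dictionaries, one for core and one for the secondary memory file. The values follow the example of Fig. 9 and are treated as octal, which is an assumption; the function names are hypothetical.

```python
# Core memory and the secondary memory file, keyed by address.
main_memory = {0o1450: 0o16345}
secondary_memory = {0o16345: 0o1234567}


def las(address):
    """Load A from secondary memory: the core word is used as a pointer."""
    pointer = main_memory[address]
    return secondary_memory[pointer]


def sas(address, a_register):
    """Store A in secondary memory at the location the core word names."""
    pointer = main_memory[address]
    secondary_memory[pointer] = a_register
```

Here `las(0o1450)` fetches the pointer 16345 from core and returns the word 1234567 held at that address in the random file.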

Random  files  are  implemented  like  sequential  files  except  that 
end of record indicators are not meaningful. Although as many
index blocks are used up as required by the size of a random file,
only those data blocks which actually contain information will be
attached  to  a  random  file.  As  new  locations  are  accessed,  new  data 
blocks  are  attached. 
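
A random file whose data blocks are attached only when needed behaves like a sparse array of 256-word blocks. In the sketch below a block is attached on the first store into it, and a load from a block never written returns zero; both choices are assumptions, as is the use of a dictionary in place of the index blocks.

```python
BLOCK_WORDS = 256


class RandomFile:
    def __init__(self):
        self.index = {}     # block number -> data block (stands in for index blocks)

    def write(self, address, word):
        block_number, offset = divmod(address, BLOCK_WORDS)
        if block_number not in self.index:
            self.index[block_number] = [0] * BLOCK_WORDS   # attach a new block
        self.index[block_number][offset] = word

    def read(self, address):
        block_number, offset = divmod(address, BLOCK_WORDS)
        block = self.index.get(block_number)
        return 0 if block is None else block[offset]

    def attached_blocks(self):
        return sorted(self.index)
```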

Subroutine files

Whereas it makes little sense to associate, say, a card reader with
a  random  file,  a  sequential  file  can  be  associated  with  any  physi- 




(a) Instructions

Main memory      Secondary memory
LDA* ADDR        LAS ADDR
STA* ADDR        SAS ADDR

(b) Addressing

Address    Contents
600        LAS 1450
1450       16345
16345      1234567

Effect: 1234567 → A

Fig. 9. Load and store from main and secondary memory. (a) Instructions. (b) Addressing.

cal  device  in  the  system.  In  addition,  a  sequential  file  may  be 
associated  with  a  subroutine.  Such  a  file  is  called  a  subroutine  file. 
and  the  subroutine  may  thus  be  thought  of  as  a  "nonphysical" 
device,  The  subroutine  file  is  defined  by  the  address  of  a  subroutine 
together  with  information  indicating  whether  it  is  an  input  or  an 
output  file  and  whether  it  is  word  or  character  oriented.  An  input 
operation  from  a  subroutine  file  causes  the  subroutine  to  be  called. 
When  it  returns,  the  contents  of  the  A  register  is  taken  to  be  the 
input  requested.  Correspondingly,  an  output  operation  causes  the 
subroutine  to  be  called  with  the  word  or  character  being  output 
in  A.  The  subroutine  is  completely  unrestricted  in  the  kinds  of 
processing  it  can  do.  It  may  do  further  input  or  output  and  any 
amount  of  computation.  It  may  even  call  itself  if  it  preserves  the 
old  return  address. 

Recall that for sequential files the system transforms all
information supplied by the user to the format required by the
particular file; hence, the requirement that the user, in opening a
subroutine file, must specify whether the file is to be character or
word oriented. The system will thereafter do all the necessary
packing and unpacking.
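A subroutine file can be pictured as a file object whose "device" is a callable. The sketch below is a loose analogy, not the system's mechanism: the class `SubroutineFile`, its methods, and the two example subroutines are hypothetical, and the packing and unpacking done by the system is omitted.

```python
class SubroutineFile:
    """A sequential file whose 'device' is a callable (hypothetical model)."""

    def __init__(self, subroutine, is_input):
        self.subroutine = subroutine
        self.is_input = is_input

    def read(self):
        if not self.is_input:
            raise ValueError("file was opened for output")
        return self.subroutine()       # the returned value plays the role of A

    def write(self, item):
        if self.is_input:
            raise ValueError("file was opened for input")
        self.subroutine(item)          # the item is passed as if in A


def make_counter():
    """Example input subroutine: each call yields the next integer."""
    n = 0

    def next_value():
        nonlocal n
        n += 1
        return n

    return next_value


collected = []
source = SubroutineFile(make_counter(), is_input=True)
sink = SubroutineFile(lambda ch: collected.append(ch.upper()), is_input=False)

first, second = source.read(), source.read()
sink.write("a")
```

The program reading `source` or writing `sink` cannot tell whether a physical device or a computation stands behind the file, which is the decoupling the section goes on to discuss.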

Subroutine files are the logical end-product of a desire to
decouple a program from its environment. Since they can do
arbitrary computations, they can provide buffers of any desired
complexity between the assumptions a program has made about its
environment and the true state of things. In fact, they make it
logically unnecessary to provide an identical interface for all the
input-output devices attached to the system; if uniformity did not
exist, it could be simulated with the appropriate subroutine files.
Considerations of convenience and efficiency, of course, militate
against such an arrangement, but it suggests the power inherent
in the subroutine file machinery.

Summary 

The user machine described was designed to be a flexible
foundation for development and experimentation in man-machine
systems. The user has been given the capability to establish
configurations of multiple processes, and the processes have the ability to
communicate conveniently with each other, with central files, and
with peripheral devices. A given user may, of course, wish only
to use a subsystem of the general system (e.g., a compiler or a
debugging routine) for his particular job. In the course of using
the subsystem, however, he may become dissatisfied with it and
wish to revise or even rewrite the subsystem. The features of the
user machine not only permit this activity but make it easier.

References 

BrigH64; ComfW65; ConwM63; CorbF65; DaleR65; DennJ66; ForgJ65;
LampB65; LichW65; McCaJ63; McCuJ65; SaltJ66; SchwJ64


Part  4 


The  instruction-set  processor  level: 
special-function  processors 

This part contains descriptions of processors that do not interpret general
programming languages; that is, they are not Pc's. They are all P's, however, since
they have an interpreter that determines not only the operations to be taken, given
the current instruction, but the next instruction to be obtained.

A Pio (Sec. 1) is a processor that controls T and Ms components. It manages
block or vector transmission between Ms or T and Mp.

A P.array (Sec. 2) processes both vectors and two-dimensional matrices. By
recognizing these data as fundamental units, programs (or algorithms) can be
expressed efficiently in terms of primitive operators. The chief advantage of these
P's is their ability to exploit the data structure for parallel interpretation,
thereby increasing processing speed.

A microprogram processor (Sec. 3) is designed to interpret and process a data
type which is a program. In effect, this processor is a computer within another
computer, programmed to act as an interpreter.

A language processor (Sec. 4) interprets a data type derived from the primitives
of a programming language. In contrast, a conventional processor interprets a
language based on fundamental hardware implementation primitives. The difference
is clearly apparent in the increased complexity of the language processors.




Section  1 


Processors  to  control  terminals 
and  secondary  memories 
(input-output  processors) 

The first three chapters of this section show the evolution of
the IBM Data Channels (io processors) from 1958 (the 7094
II) to the present (the 1800, which came after the 360). The
processor approach for controlling T and Ms components, while
more general, should be contrasted with the specialized
one-instruction controls in the B 5000 (Chap. 22) and Burroughs
D825 (Chap. 36).

The fourth chapter, on the DEC 338, shows a processor that
controls cathode-ray-tube display consoles. The graphic
terminals are the first T's of sufficient complexity to utilize a
processor of their own. The first CRT displays used the Pc (e.g.,
on Whirlwind); then small Pc's were adapted to the task; the
DEC 338 is one of the earliest special P.display's that
appeared.

There  is  no  example  in  this  section  of  a  specialized  P  for 
message  concentration  and  switching.  For  computer  systems 
multiple  remote  inputs  are  still  recent  enough  so  that  either 
the  main  Pc  handles  the  task,  via  specialized  K,  or  small  Pc's 
are  committed  to  it.  However,  in  the  telephone  industry  there 
has  been  a  very  substantial  development  by  the  Bell  System 
of  the  Electronic  Switching  System  (ESS),  which  uses  specialized 
C's  to  control  switching  (routing).  In  computer  systems,  we  can 
expect  the  use  of  such  specialized  processors  to  increase  in 
the  near  future. 


The  IBM  7094  II 

The IBM 709, a member of the IBM 701-7094 II family, is one
of the first computers to have an io processor (IBM name: Data
Channel) in its structure. Chapter 41 discusses the two Data
Channel types: the early 7607 and the later 7909. The 7909
Data Channel ISP, and a K which it controls, are given in
Appendices 2 and 3 of Chap. 41. The principal difference is this:
with the 7909, Pc controls the Pio('7909), which in turn controls
the K, which in turn controls a T or Ms; with the 7607, Pc controls
both the Pio('7607) and the K, and the K controls the T or Ms.
The series is discussed in Part 6, Sec. 1, page 515.


The  structure  of  System /360 

Part  I— outline  of  the  logical  structure 

The io processors (Selector and Multiplexor Channels) in the
System/360 have evolved from the IBM 701-7094 II Series. Part
6, Sec. 3 presents the ISP and PMS structures for these
processors. Depending on the computer model, the implementations
are realized by a microprogrammed processor interpreting a
shared control program for both Pio's and Pc, or by a hardwired
Pio. The multiple Pio's in a 360 Multiplexor Channel, though
logically independent, are implemented as a single, shared
physical processor.

The  IBM  1800 

The  Pio's  in  this  structure  are  presented  in  Chap.  33,  and  the 
structure  is  discussed  in  Part  5,  Sec.  2,  page  396. 

The  Digital  Equipment  Corporation  DEC  338  display  processor 

The DEC 338 is an early P.display. It directly interprets a stored
program to control a T.display. Earlier T.displays were
controlled by Pc (Whirlwind, Chap. 6), or by a special K.display
without stored-program capability, or by a general-purpose Pio.
The last method outputs fixed-length blocks containing data to
be interpreted by T.display as points, vectors, characters,
curved line segments, etc. The control of T.display first by Pc,
then by a K, then by a Pio, and finally by a P.display has been
observed as an evolution [Myer and Sutherland, 1968]. Myer
and Sutherland also observe that the evolution is about to
become a closed cycle because the generality of a Pc is needed
to control a T.display.

Note that the 338 has a very extensive ISP. In fact, the
P.display's ISP is more extensive than that of its companion Pc,
the PDP-8 (Chap. 5). There are some display tasks which require
Pc, for example, compiling programs (pictures), calculating
elaborate light-pen tracking figures, making coordinate and
curved lines to straight-line vector approximation
transformations, and communicating with other system components.




Another approach to the design of a P.display is based on
a P.microprogram which is shared among many T.displays
[Rose, 1967]. Yet another alternative, which has not yet been
tried, is to incorporate a Pio (P.display) as a special mode in
a conventional Pc. Thus the P would interpret either
conventional Pc instructions or P.display instructions.

P.display is the interpreter for the output of pictures or
graphics. The 338 utilizes data space efficiently simply because
the data are long variable-length strings (word vectors). The
instruction requires almost no space to specify the data
operations and addresses; data are interpreted directly or
immediately in the instruction rather than via instruction addresses.

Another feature which allows a program to be efficiently
encoded is the stack mechanism for storing subroutine
linkages. Subroutines in P.displays are actually programs which
form part of a more complete picture; that is, subroutines are
subpictures. Although the stack mechanism allows for recursive
picture calls, the stack is used principally to save space and
to allow multiple T.displays to use common picture programs.

A problem in the 338 which is common to all multi-P
structures is intercommunication among the P's. Pc is the
controlling P, as is the case with most Pc-Pio structures. The P('338)
has no trap to itself but relies on an interrupt signal to Pc. The
Pc processes both tasks which P.display might process, given
an interrupt system, and other tasks beyond P.display's
capability.

A clock should be built into the 338. The brightness or
intensity of a picture is determined both electronically (see the
mode instructions for controlling intensity) and by the rate at
which the pictures are repeated. A clock would allow the time
when pictures are started or drawn to be specified; thus the
intensity would be independent of picture length.

The 338 requires more hardware than a simpler Pc. However,
a large amount of this hardware is used to control the
generation of characters and lines. The lines (vectors) are drawn
using a DDA (Digital Differential Analyzer) technique. Perhaps
one-half of the registers could be eliminated if the 338 were
not a P. A simpler alternative was constructed about a similar
computer, the PDP-9, by Bell Telephone Laboratories and DEC,
using the approach of making the display only a K.

A  more  elaborate  Pc  interrupt  system  with  reduced  overhead 
time  would  enable  Pc  to  take  on  the  specialized  program  control 
functions  in  the  338.  Such  a  scheme  might  pass  the  program 
or  instruction  counter  parameter  directly  from  P.display  to  Pc. 
In  this  way,  Pc  or  P.display  would  alternatively  process  part  of 
a  single  instruction  stream,  depending  on  the  task. 

Despite the problems of this early P.display, it has a
sophistication which successors appear to be following.


Chapter  25 

The  DEC  338  display  computer 


Introduction 

The C(display; 'DEC 338) is a C('DEC PDP-8) with a P.display
which can connect to T(#1:8; CRT; display; area: 9.375 × 9.375
in.²). The PMS structure is shown in Fig. 1, Chap. 5, describing
the PDP-8. The Pc ISP is given in Appendix 1 of Chap. 5.

The C('338), although designed to stand alone, is generally used
as a satellite to a larger C, via an L(Dataphone). The rationale
for using a C as a T is based on the bandwidth and storage
requirements needed to maintain graphical picture displays. A human
being manipulating pictures (rotation, scale change, and
conversion of internal linked data structure to a picture structure)
requires short response time; this requirement places high processing
demands on larger C's. Thus this C(display) is a preprocessor for
larger, more general C's.

The actual T(CRT) is a 16-inch CRT with a 9⅜-inch square
viewing area covered by 1,024 × 1,024 (XY) points. The diameter
of the points is approximately 0.015 inch. The spot is magnetically
deflected and focused. All eight T(CRT)'s can be driven together or
used independently. A photomultiplier connected through a fiber-optic
bundle link is used as a light pen (a photosensitive sensor) to detect
spots on the T. The light pen allows the P.display to detect
whether a user has "pointed to" a displayed spot.

Fig. 1. DEC 338 instruction-interpretation state diagram.

Pc and P.display access the same Mp; the total data rate
available from Mp is one 12-bit word/1.5 microseconds. The instruction
times of P.display are a function of the point plotting times of
the T(CRT): 0.3 microsecond to the next incremental unintensified
point (approximately 0.010 inch away); 1.2 microseconds to an
incremental intensified point; and 3.5 microseconds to a point
plotted at a random position.

The state (registers) of C.display is given in the ISP description
of Appendix 1 of this chapter. There are four parts of the state:
the control registers for Program Flow State, the Picture State
(or position of beam), Console and Light-pen State, and Mp State.
The instruction interpreter is fairly simple and is best described
in the state diagram (Fig. 1). The instructions are given in Tables
1 and 2. The remainder of the chapter discusses the P.display
instructions and the Pc instructions for communicating with
P.display.

Principle  of  operation 

The actual picture is held stationary by repeatedly displaying
(intensifying) a particular point, line, etc. The number of times
a figure has to be displayed so that it appears stationary and does
not flicker depends on the CRT phosphor, the figure, and
environmental parameters. The generally accepted range is a plotting rate
of 20 to 50 plots/second; thus a complete picture has to be drawn
in 50 to 20 milliseconds. If we assume a 30-Hz plot rate, about
28,000 points can be plotted in vector mode (or 280 to 1,120 inches,
depending on the spacing). About 1,000 characters can be
displayed in 30 milliseconds using character mode.
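The 28,000-point figure follows from the timing given in the introduction. A minimal check, assuming every point is an intensified incremental step of 1.2 microseconds and that adjacent points are 0.010 inch apart (the constant names are chosen here for illustration):

```python
FRAME_RATE_HZ = 30          # assumed plot rate, as in the text
INTENSIFIED_STEP_US = 1.2   # time to an incremental intensified point
STEP_INCHES = 0.010         # approximate spacing of adjacent points

frame_time_us = 1_000_000 / FRAME_RATE_HZ                 # time per picture
points_per_frame = int(frame_time_us / INTENSIFIED_STEP_US)
inches_per_frame = points_per_frame * STEP_INCHES

print(points_per_frame, round(inches_per_frame))
```

One frame lasts about 33 milliseconds, which allows a little under 28,000 intensified steps, or roughly 280 inches of line at the finest spacing. Coarser spacing set by the scale factor lengthens the line drawn by the same number of steps.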

When the light pen is used, a display program is required to
"track" the pen. The pen's position is determined by displaying
known points. The pen, of course, detects the points when it is
present at the displayed point's position; therefore the program
knows the location of the pen.

The parameters of interest for a display vary, depending on
the application. However, the general parameters are:




Table 1  DEC 338 control-mode instruction set

Bits 0:2 of each instruction hold the op code. The remaining fields
are listed in bit order, from bit 3 toward bit 11.

Instruction             Op code   Fields
Parameters              0         set† Scale; Scale<0:1>; set lt pen; lt pen;
                                  set Intensity; Intensity<0:2>
Mode                    1         stop; clear flags; set mode; Data_Mode<0:2>;
                                  clear sector; clear X, Y; enter Data_State
Jump‡                   2         set Scale; Scale<0:1>; set lt pen; lt pen;
                                  push; Memory field<0:2>
Pop                     3         set Scale; Scale<0:1>; set lt pen; lt pen;
                                  inh§ Data_Mode; inh Scale, lt pen;
                                  inh intensity; enter Data_State
Conditional skip        4         reverse¶ test; clear bits after test;
                                  complement after test;
                                  Push_Buttons<0:5>/PB<0:5>
Conditional skip        5         reverse¶ test; clear bits after test;
                                  complement after test;
                                  Push_Buttons<6:11>/PB<6:11>
Arithmetic compare PB   6         0 0 0; Push_Buttons<0:5>
Arithmetic compare PB   6         0 0 1; Push_Buttons<6:11>
Skip on flags           6         0 1 0; skip; skip if not in sector;
                                  skip if PB<0:5> = 0; skip if PB<6:11> = 0
Count                   6         0 1 1; count scale (0 → +1, 1 → −1);
                                  count Intensity (0 → +1, 1 → −1)
Set slaves              6         1; Group number<0:1>;
                                  set unit 0, lt pen, Intensity;
                                  set unit 1, lt pen, Intensity
Spare                   7

† Set; allow instruction bits to specify new value.
‡ A two-word instruction; second word contains low-order 12 bits for DAC (jump address).
¶ Skip can be for true or false.
§ Inhibit restoration of bits.
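As an illustration of how the Table 1 fields sit in a 12-bit word, the sketch below decodes a Parameters instruction (op code 0). It assumes the PDP-8 convention that bit 0 is the most significant bit and that the Parameters fields occupy bits 3 through 11 in the order listed; the helper names `field` and `decode_parameters` are hypothetical.

```python
def field(word, first, last, width=12):
    """Extract bits first..last of a word whose bit 0 is the most significant."""
    return (word >> (width - 1 - last)) & ((1 << (last - first + 1)) - 1)


def decode_parameters(word):
    """Decode a Parameters instruction (op code 0), fields as in Table 1."""
    if field(word, 0, 2) != 0:
        raise ValueError("not a Parameters instruction")
    return {
        "set_scale": field(word, 3, 3),
        "scale": field(word, 4, 5),
        "set_light_pen": field(word, 6, 6),
        "light_pen": field(word, 7, 7),
        "set_intensity": field(word, 8, 8),
        "intensity": field(word, 9, 11),
    }


# op 000, set scale, scale = 2, light pen untouched, set intensity = 5
decoded = decode_parameters(0b000_1_10_0_0_1_101)
```

Each "set" bit acts as an enable: the accompanying value is taken into the register only when the enable bit is 1, as the first footnote to the table states.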


1 Picture
  a Display area
  b Phosphor type (intensity and color as function of time)
  c Spot size
  d Resolution
  e Linearity
  f Short-term and long-term stability

2 Figure plotting (generation) characteristics
  a Data types: points, lines (vectors), graphs, characters
    (from a fixed set), characters (from a defined set),
    curved-line segments, etc.
  b Plotting time

3 Transformation and internal representations
  a Space to encode (specify) a figure
  b Scale change, rotation, coordinate-system transformation
    abilities
  c Ability to communicate between a displayed data
    structure and an internal representation of a picture

4 Light-pen or graphic input capability



Instructions  and  their  interpretation  in  P(display) 

Two instruction-set types are interpreted in the P.display: Data
State, in which instructions specify display information; and
Control State, in which instructions specify program control
information (e.g., jumps, modes, etc.). A state diagram for the
interpretation process is given in Fig. 1.

Data-state  instructions 

There are seven instructions (which DEC calls modes) that can
be executed while P.display is in data state. The instructions
(modes) are really substates of data state. The instructions (actually
more like data) are interpreted for the mode. When all the
data-mode instructions have been interpreted, an escape instruction
returns the P.display to control state. A control instruction is issued
to select a mode and simultaneously place the display in data state.

Increment mode. This mode is used to draw curves and
alphanumeric characters and other small symbols. Two instructions are
stored per word. An instruction will cause the beam position to
be moved one, two, or three times, in 0.010-inch increments, in
one of eight directions. Direction 0 is to the right, direction 1 is
up and to the right, etc.
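A 6-bit increment can be decoded as sketched below. The field order (intensify bit, 2-bit move count, 3-bit direction) follows Table 2, and a count of 0 means "move 1 and escape". The direction table is an assumption: the text gives only directions 0 and 1, and the sketch continues counterclockwise in 45-degree steps. The scale factor is ignored, and the function names are hypothetical.

```python
# Assumed direction table: 0 = right, 1 = up and to the right, then
# continuing counterclockwise (only directions 0 and 1 come from the text).
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]


def decode_increment(half):
    """Decode one 6-bit increment: intensify bit, 2-bit count, 3-bit direction."""
    intensify = bool((half >> 5) & 1)
    count = (half >> 3) & 0b11
    dx, dy = DIRECTIONS[half & 0b111]
    steps = count if count else 1        # count 0 means "move 1 and escape"
    escape = count == 0
    return intensify, steps, escape, (dx * steps, dy * steps)


def decode_word(word):
    """Split a 12-bit word into its two 6-bit increments, high half first."""
    return decode_increment((word >> 6) & 0o77), decode_increment(word & 0o77)


first, second = decode_word(0b110001_000000)
```

Here the first increment is intensified and moves two steps up and to the right, and the second moves one step to the right and escapes to control state.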


Table 2  DEC 338 data-mode instruction set

Mode              Time (µs)                Word     Instruction bits
point             6 to 35                  1 of 2   int(a); inh(b); Y coordinate
                                           2 of 2   esc(c); inh; X coordinate
increment         1.5 + 2 × (0.9 to 3.6)   1        int; move count(d); move direction(e);
                                                    bits 6 to 11 same as bits 0 to 5
vector            1 to 150                 1 of 2   int; ±; Delta Y
                                           2 of 2   esc; ±; Delta X
vector continue   1 to 1,200               1 of 2   int; ±; Delta Y
                                           2 of 2   esc; ±; Delta X
short vector      1.8 to 24                1        int; ±; Delta Y; esc; ±; Delta X
6-bit character   3.75+                    1        character 1; character 2
7-bit character   4.5+                     1        blank; character
graph plot        6 to 35                  1        esc; X/Y(f); Y or X coordinate
spare

(a) Intensify; turn on beam.
(b) Inhibit; do not set value into Y or X coordinate.
(c) Escape; enter control state.
(d) 0 → move 1 and escape; 1, 2, 3 → move 1, 2, 3.
(e) 8 directions.
(f) 0 → set Y and increment X; 1 → set X and increment Y.




Vector mode. The vector mode is used to draw straight-line
segments. This two-word instruction causes the beam position to be
moved along a line represented by an 11-bit delta y and an 11-bit
delta x.

Vector  continue  mode.  This  mode  is  used  to  draw  a  straight  line 
to  the  edge  of  the  screen.  It  is  similar  to  vector  mode  but  causes 
the  line  to  be  extended  until  an  "edge"  is  encountered. 

Short vector mode. The short vector mode is used to draw figures
composed of short line segments. A one-word instruction specifies
a 5-bit delta y and a 5-bit delta x quantity. It is transformed within
the display to the same format as vector mode and operates in
the same manner.

The  preceding  modes  move  the  beam  by  counting  the  X  and 
Y  position  registers.  The  counting  is  done  at  1.2  microseconds  per 
step  on  an  intensified  move  and  at  0.30  microsecond  per  step  on 
a  nonintensified  move. 

Point  mode.  Point  mode  is  used  for  random  point  plotting.  A 
two-word  instruction  specifies  new  Y  and/or  X  coordinates  to  be 
placed  into  the  Y  and  X  position  registers. 

Graph-plot mode. This is used to draw curves of mathematical
functions. A one-word instruction has data for the Y or X position
register; at the same time, X or Y, respectively, is incremented
by a count of one, two, four, or eight, depending on the scale
factor.

Point  and  graph-plot  modes  operate  at  a  rate  depending  upon 
the  position  of  the  new  point  with  respect  to  the  previous  point. 
If  a  point  is  only  one-eighth  of  the  screen  away,  the  delay  for 
beam-settling  time  is  6  microseconds;  otherwise  the  settling  time 
is  35  microseconds. 

Character generation option instructions. The alphanumeric
characters or special symbols which make up a character set are stored
in Mp in increment mode or short vector mode. These characters
can be arbitrarily defined. A 6-bit (or 7-bit) character code in the
instruction is used to locate a word in a table in Mp called the
dispatch table. The base address of the table is specified by the
Starting Address Register/SAR<0:5>.

SAR may be loaded by instructions from the Pc. The SAR
represents the most significant 6 bits of a 15-bit memory address.
The character code represents the least significant 6 (or 7) bits.
A seventh SAR bit, corresponding to the octal position 100, is used
with 6-bit characters as a case bit (i.e., uppercase or lowercase
characters) and may be set or cleared with a control character.


A word in the dispatch table has the following format:

Bit 0: If bit 0 is a 1, bits 1 to 11 are used to perform a control
function as specified by particular control instructions.
If bit 0 is a 0, bits 2 to 11 are combined with SAR to
specify the address at which the character definition
program starts. (The address bit 2 is common to both
the SAR and bit 2 of the dispatch word and so may
be specified in either place or in both places.)

Bit 1: Determines the mode in which the character is to be
displayed. If bit 1 is a 0, the increment mode is used
to plot the character; if bit 1 is a 1, the short
vector mode is used to plot the character.
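The dispatch-word format can be made concrete with a short decoding sketch. It assumes bit 0 is the most significant bit of the 12-bit word, that SAR forms the top 6 bits of the 15-bit address, and that the one shared address bit is combined by a logical OR ("either place or both"). The function name and the dictionary result are illustrative only.

```python
def decode_dispatch(word, sar):
    """Decode a 12-bit dispatch-table word; bit 0 is the most significant bit."""
    if word & 0o4000:                        # bit 0 = 1: control function
        return {"control": word & 0o3777}    # bits 1 to 11
    mode = "short vector" if word & 0o2000 else "increment"   # bit 1
    # SAR supplies the top 6 bits of the 15-bit address and dispatch bits
    # 2 to 11 the low 10 bits; the overlapping bit is ORed (an assumption).
    address = ((sar & 0o77) << 9) | (word & 0o1777)
    return {"mode": mode, "address": address}


entry = decode_dispatch(0o2005, sar=0o01)
```

With a 6-bit SAR shifted to the top of a 15-bit address and a 10-bit field from the dispatch word at the bottom, exactly one bit position overlaps, which matches the parenthetical remark about the shared address bit.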

Control-state  instructions 

There are six control-state instructions.

Parameter. Parameter is used to set values in scale, light-pen, and
intensity registers.

Mode. Mode is used to set up the data-state mode (or data-mode
instruction). Mode also is used to stop the display.

Conditional skip. The skip instruction tests the state of the
P.display and the pushbuttons.

Miscellaneous. These instructions include both tests and additional
parameter control.

Display jump and push-jump subroutine instructions. The display
jump instruction has 15 address bits, so that a jump may be
executed to any location in the display file within the 32-kw
memory.

The display subroutine instructions are push-jump (an extension
of the jump instruction) and pop, the return from subroutine. The
push-jump works as follows: The current state of the display (Light
Pen Enable, Data Mode, Scale, and Intensity) is stored, along with
the return address, in two successive locations in the first 4,096
words of memory. The locations are determined by the pushdown
pointer, PDP. This pointer is initially set by a Pc instruction. The
normal jump is then executed.

To  return  from  a  subroutine,  the  pop  instruction  is  executed. 
It  has  no  address  bits.  Its  function  is  to  return  the  display  to  a 
previous  state  by  sending  the  last  words  on  the  push-down  stack 
back  to  the  display. 

The  stack  approach  to  subroutining  as  implemented  on  the  338 
has  certain  advantages  over  the  jump  to  subroutine  instruction 
normally  used  in  Pc's: 




1  Memory  space  is  conserved  since  return  address  locations 
are  not  required  in  each  subroutine  in  memory. 

2  A  subroutine  can  be  called  any  number  of  times  before 
return  to  the  main  routine. 

3  Since  the  state  of  the  display  is  saved  on  the  stack  and 
subsequently  restored,  subroutines  are  truly  transparent; 
that  is,  after  the  return  they  leave  the  state  of  the  display 
program  the  same  as  before  the  subroutine  call. 

4  The subroutines can either retain the same state or change
the state of the display by using one or more of the "inhibit
restore" bits available in the pop instruction. The
programmer can elect independently to inhibit restoration of mode,
light pen and scale, or intensity information.
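The push-jump and pop behavior, including the inhibit-restore option, can be modeled roughly as follows. This is a simplified sketch: the class `DisplayStack`, the upward-growing stack, the dictionary used for the saved state word, and per-register inhibit names are all assumptions (the 338 groups light pen and scale under one inhibit bit).

```python
class DisplayStack:
    """Toy model of push-jump and pop with state saving (hypothetical)."""

    def __init__(self, pdp=0):
        self.memory = {}     # stands in for the first 4,096 words of memory
        self.pdp = pdp       # push-down pointer, set initially by Pc
        self.dac = 0         # display address counter
        self.state = {"mode": 0, "light_pen": 0, "scale": 0, "intensity": 0}

    def push_jump(self, target):
        self.memory[self.pdp] = dict(self.state)   # first word: display state
        self.memory[self.pdp + 1] = self.dac       # second word: return address
        self.pdp += 2                              # assumed growth direction
        self.dac = target

    def pop(self, inhibit=()):
        self.pdp -= 2
        saved = self.memory[self.pdp]
        self.dac = self.memory[self.pdp + 1]
        for name, value in saved.items():
            if name not in inhibit:                # "inhibit restore" bits
                self.state[name] = value


d = DisplayStack()
d.dac = 10
d.state["scale"] = 1
d.push_jump(100)                 # call a subpicture at address 100
d.state["scale"] = 3             # the subpicture changes scale and intensity
d.state["intensity"] = 7
d.pop(inhibit=("intensity",))    # restore all state except intensity
```

After the pop, the scale is back to its value before the call while the intensity set inside the subpicture survives, which illustrates points 3 and 4 of the list above.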

Instructions  in  Pc  for  communicating  with  P(display) 

Instructions in Pc communicate with P.display. The physical
connection is by the S('I/O Bus). The in-out transfer instructions in
Pc are used to initialize and read the state of P.display.

P.display state initialization from Pc instructions
Set Push Down Pointer from AC
Set Display Address Counter from AC
Set Push Button contents from AC
Set miscellaneous flag and status bits from AC
Set character generator SAR address

P.display status to Pc instructions
Read Push Down Pointer into AC
Read X register into AC
Read Y register into AC
Read Display Address Counter into AC
Read Status words 1, 2, 3, 4, 5 into AC (60 miscellaneous
bits of flags, modes, etc.)

Picture debugging modes. These modes aid program and
picture debugging. A bit can be set to override the nonintensify bit
in data-mode instructions. When this bit is a 1, all points and
vectors are plotted, whether they are to be intensified or not. The
search enable instruction forces the display to run until a
particular instruction type is found. The instruction type is specified by
the search enable instruction.




APPENDIX  1    DEC  338  DISPLAY  PROCESSOR  ISP  DESCRIPTION 


Append  i  x 

1 

DEC  338  D' 

splay  Processor 

ISP  Desc 

ription  (partially  complete) 

P. display  State 

Program  Flow  State 

DAC<D:l4> 

Display  Address  Counter;  holds  memory  address  of  display 

instruction 

PDP<0: I 1> 

Push  Down  Pointer  to  stack  holding  subroutine  return  addresses 

Internal ^Stop 

denotes  halt  by  a  P, display  instruction 

External  ^top 

denotes  a  request  by  Pa  for  P. display  to  halt 

Data^tate  and  Control^tate  are  two  mutually  exclusive 

states. 

Data^tate  instructions  are  interpreted  by  P.  display  as  points. 

lines,  and  characters  to  be  displayed  on 

T.    There  are  7  modes  for  specifying  the  data  types.     The  DataJ^ode  register  holds  the 

data  type  being  interpreted,     Control^tate  instimcttons 

include 

jumv  to  subroutines  using  the  stack,  controlling  P. display  state 

registers  and  switching  to  a  specific  data  mode. 

Data^State 

Control^State  :=  -i  Data^State 

DataJ1ode/0M<0:2> 

specifies  interpretation  of  Datat_^tate  mstiTuctions 

SAR<0:5> 

catling  character  display  subroutines 

Picture State

X<0:12>                     beam position; only integers in range 0 ≤ X,Y < 2^(10+dimension) are plotted
Y<0:12>
Vertical_edge_flag/Vef      denotes if beam is within a displayable area
Horizontal_edge_flag/Hef    set when beam moves outside the display area
Edge_Interrupt/EI
CHSZ                        Character Size; 0 indicates 6 bit character set, 1 indicates 7
Scale<0:1>                  used to set increment size for Data_Mode instructions; increments are × 2^Scale
Scale'<0:2> := (Scale<0:1> + 1)
Intensity<0:2>              brightness of displayed points
X_dimension<0:1>            maximum dimension of plotting area: 9.375, 18.75, 37.5, 75.0 in.
Y_dimension<0:1>
Beam                        on, to display a point or line; automatically turned off at instruction completion

Console and Light Pen State

Push_Buttons/PB<0:11>       register with lights; can be complemented manually or by processor
Push_Button_Hit/PBH         flag is set by manually striking any push button
Manual_Interrupt/MI         key which is used to interrupt Pc and becomes one when struck
Light_Pen_Find/LPF          stops the display and interrupts Pc whenever the Light Pen has seen a displayed spot and the Light_Pen_Enable is a one
Light_Pen_Enable/LPE        a bit to enable the Light_Pen_Find flag to cause an interrupt

Mp State

M[0:7][0:4095]<0:11>        primary memory for P.display and Pc

Instruction Format

instruction/i<0:11>         The individual instruction fields are defined below. Each instruction type has its own bit field assignments.
enter_data_state    := i<11>        common bits for several instructions
pb_sense            := i<3>         push button control bits


pb_clear              := i<4>
pb_complement         := i<5>
pb_select<0:5>        := i<6:11>
scale_change/sc       := i<3>       scale (size) control bits
scale_value/sv<0:1>   := i<4:5>
light_pen_change/lpc  := i<6>       light pen test control bits
light_pen_bit/lpb     := i<7>

Instruction Interpretation Process

(¬ Internal_Stop ∨ ¬ External_Stop) → (
    instruction[0:1] ← M[DAC:DAC+1]; DAC ← DAC + 1; next                        fetch
    (Control_State ∧ (instruction<0:1> = 2)) → (DAC ← DAC + 1);                 2 w instruction
    (Data_State ∧ ((Data_Mode = 0) ∨ (Data_Mode = 2) ∨
        (Data_Mode = 3))) → (DAC ← DAC + 1); next                               2 w data
    Instruction_execution)                                                      execute
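
The fetch step above advances DAC by one or two words depending on the instruction. A minimal Python sketch of that length decision follows. The function name is hypothetical, and treating the jump opcode (010) as the only two-word control instruction is an assumption based on the two-word jump format defined in this appendix.

```python
# Illustrative sketch only; names are hypothetical, not part of the ISP description.
TWO_WORD_DATA_MODES = {0b000, 0b010, 0b011}  # point, vector, vector continue
JUMP_OPCODE = 0b010  # assumed to be the only two-word control instruction


def instruction_length(data_state: bool, data_mode: int, first_word: int) -> int:
    """Return how many 12-bit words the current display instruction occupies."""
    if data_state:
        return 2 if data_mode in TWO_WORD_DATA_MODES else 1
    # Bits <0:2> are the three most significant bits of the 12-bit word.
    opcode = (first_word >> 9) & 0b111
    return 2 if opcode == JUMP_OPCODE else 1
```

In this sketch DAC would advance by the returned length before execution begins.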

Instruction  Set  and  Instruction  Execution  Process 
The following instruction set definition is not complete. It does not include the complete character instruction definition or
the miscellaneous and conditional skip instructions. Most of the instructions are microcoded.

Instruction_execution := (

Control Instructions

parameter<0:11> := i[0]<0:11>                       set parameter instruction format
parameter_opcode := (i<0:2> = 000)
parameter_intensity_change := parameter<8>
parameter_intensity<0:2> := parameter<9:11>

parameter_opcode ∧ Control_State → (                parameter execution
    scale_change → (Scale ← scale_value);
    light_pen_change → (Light_Pen_Find ← light_pen_bit);
    intensity_change → (Intensity ← parameter_intensity));

mode<0:11> := i<0:11>                               set mode instruction format
mode_opcode := (i<0:2> = 001)
mode_stop_code := mode<3>
mode_clear_push_button_flag := mode<4>
mode_data_mode_change := mode<5>
mode_set<0:2> := mode<6:8>
mode_clear_sector := mode<9>
mode_clear_coordinate := mode<10>

mode_opcode ∧ Control_State → (                     set mode execution
    mode_stop_code → (Internal_Stop ← 1);
    mode_clear_push_button_flag → (Push_Button_Hit ← 0);
    mode_data_mode_change → (Data_Mode ← mode_set);
    mode_clear_sector → (X<0:2> ← 0; Y<0:2> ← 0);
    mode_clear_coordinate → (X<3:12> ← 0; Y<3:12> ← 0);
    enter_data_state → (Data_State ← 1));


PB_1<0:11> := i<0:11>                               group 1 push button test and set instruction format for Push Buttons 0 to 5
PB_1_opcode := (PB_1<0:2> = 100)                    group 2 (not defined) is for Push Buttons 6 to 11

PB_1_opcode ∧ Control_State → (                     PB_1 instruction execution
    pb_sense ⊕ (pb_select<0:5> = (PB<0:5> ∧ pb_select<0:5>)) → (    skip test
        DAC ← DAC + 2);
    pb_clear → (PB<0:5> ← PB<0:5> ∧ pb_select<0:5>); next
    pb_complement → (PB<0:5> ← PB<0:5> ⊕ pb_select<0:5>));

jump[0:1]<0:11> := i[0:1]<0:11>                     jump and stack push down (subroutine calling) instruction format
jump_op := (i[0]<0:2> = 010)
jump_push := i[0]<8>
jump_field<0:2> := i[0]<9:11>

jump_op ∧ Control_State → (                         jump and push down execution
    scale_change → (Scale ← scale_value);
    light_pen_change → (Light_Pen_Find ← light_pen_bit);
    DAC ← jump_field □ i[1];
    jump_push → (
        M[PDP + 1] ← DAC<0:2> □ LPF □ Scale □ Data_Mode □ Intensity;
        M[PDP + 2] ← DAC<3:14>;
        PDP ← PDP + 2));

pop<0:11> := i[0]<0:11>                             stack pop instruction format; subroutine return
pop_op_code := (i<0:2> = 011)
pop_inhibit_mode := pop<8>
pop_inhibit_scale_pen := pop<9>
pop_inhibit_intensity := pop<10>

pop_op_code ∧ Control_State → (                     pop execution
    DAC<3:14> ← M[PDP];
    DAC<0:2> ← M[PDP-1];
    pop_inhibit_intensity → (Intensity ← M[PDP-1]<9:11>);
    pop_inhibit_mode → (Data_Mode ← M[PDP-1]<6:8>);
    ¬ pop_inhibit_scale_change → (
        Scale ← M[PDP-1]<4:5>;
        LPF ← M[PDP-1]<3>);
    PDP ← PDP - 2; next
    scale_change → (Scale ← scale_value);
    light_pen_change → (LPF ← light_pen_bit);
    enter_data_mode → (Data_Mode ← 1));
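
The jump and pop definitions above pack the high DAC bits and the display status into one 12-bit stack word. The following Python sketch shows one way that packing and unpacking could work. All function names and the dictionary used as memory are assumptions for illustration, and the pop inhibit bits are deliberately ignored.

```python
# Illustrative sketch; function names and the dict-based memory are assumptions.
def pack_status(dac, lpf, scale, data_mode, intensity):
    """Build the first stack word: DAC<0:2>, LPF, Scale, Data_Mode, Intensity."""
    dac_high = (dac >> 12) & 0b111  # DAC<0:2>, top three bits of the 15-bit DAC
    return (dac_high << 9) | (lpf << 8) | (scale << 6) | (data_mode << 3) | intensity


def push_return(memory, pdp, dac, lpf, scale, data_mode, intensity):
    """Store the two-word return record and return the new stack pointer."""
    memory[pdp + 1] = pack_status(dac, lpf, scale, data_mode, intensity)
    memory[pdp + 2] = dac & 0o7777  # DAC<3:14>, the low twelve bits
    return pdp + 2


def pop_return(memory, pdp):
    """Recover DAC and the status fields; inhibit bits are not modeled here."""
    status = memory[pdp - 1]
    dac = (((status >> 9) & 0b111) << 12) | memory[pdp]
    state = {
        "lpf": (status >> 8) & 1,           # word bit 3
        "scale": (status >> 6) & 0b11,      # word bits 4:5
        "data_mode": (status >> 3) & 0b111, # word bits 6:8
        "intensity": status & 0b111,        # word bits 9:11
    }
    return pdp - 2, dac, state
```

A push followed by a pop returns the stack pointer and every field to its earlier value, which is the property a subroutine return relies on.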

Data Mode Instructions

point[0:1]<0:11> := i[0:1]<0:11>                    point data instruction format
point_intensify := point[0]<0>
point_inhibit_y := point[0]<1>
point_y<0:9> := point[0]<2:11>
point_x<0:9> := point[1]<2:11>
point_escape := point[1]<0>
point_inhibit_x := point[1]<1>


(Data_Mode = 000) ∧ Data_State → (                  point data execution
    ¬ point_inhibit_x → (X ← point_x);
    ¬ point_inhibit_y → (Y ← point_y);
    point_intensify → (Beam ← 1);
    point_escape → (Data_State ← 0));

vector[0:1]<0:11> := i[0:1]<0:11>                   vector data instruction format
vector_intensify := vector[0]<0>
vector_escape := vector[1]<0>
vector_dy<0:10> := vector[0]<1:11>
vector_dx<0:10> := vector[1]<1:11>

(Data_Mode = 010) ∧ Data_State → (                  vector data execution
    Y ← Y + vector_dy;                              not correct, since the vector from point Y,X to Y + vector_dy,
    X ← X + vector_dx;                              X + vector_dx is plotted
    vector_intensify → (Beam ← 1);
    vector_escape → (Data_State ← 0));

vector_continue[0:1]<0:11> := i[0:1]<0:11>          vector continue instruction format same as vector

(Data_Mode = 011) ∧ Data_State → (                  vector continue execution
    Y ← Y + sign_extend(vector_dy);                 not correct, as vector continues plotting until edge is found
    X ← X + sign_extend(vector_dx);
    vector_intensify → (Beam ← 1);
    vector_escape → (Data_State ← 0));

short_vector<0:11> := i[0]<0:11>                    short vector instruction format
short_vector_intensify := short_vector<0>
short_vector_escape := short_vector<6>
short_vector_dx := short_vector<8:11>
short_vector_dy := short_vector<1:5>

(Data_Mode = 100) ∧ Data_State → (                  short vector execution
    X ← X + sign_extend(short_vector_dx);
    Y ← Y + sign_extend(short_vector_dy);
    short_vector_intensify → (Beam ← 1);
    short_vector_escape → (Data_State ← 0));

increment<0:5>                                      increment instruction format; 2 increments/instruction
increment_intensify := increment<0>
increment_direction/id<0:2> := increment<3:5>       1 of 8 directions
increment_count/ic<0:1> := increment<1:2>
ic1e := (ic = 0)                                    count 1 and escape to Control_State
ic1 := (ic = 1)                                     count 1
ic2 := (ic = 2)                                     count 2
ic3 := (ic = 3)                                     count 3

(Data_Mode = 001) ∧ Data_State → (                  increment instruction execution
    increment ← i<0:5>; next plot_increment_vector; next
    increment ← i<6:11>; next
    plot_increment_vector)


plot_increment_vector := (
    ic1e → (move_1_position; Control_State ← 1);                            move 1 and escape
    ic1 → (move_1_position);                                                move 1
    ic2 → (move_1_position; next move_1_position);                          move 2
    ic3 → (move_1_position; next move_1_position; next move_1_position))    move 3

move_1_position := (                                sub process for moving beam
    (id = 0) → (X ← X + Scale);                     1 of 8 positions
    (id = 1) → (X ← X + Scale; Y ← Y + Scale);
    (id = 2) → (Y ← Y + Scale);
    (id = 3) → (Y ← Y + Scale; X ← X - Scale);
    (id = 4) → (X ← X - Scale);
    (id = 5) → (Y ← Y - Scale; X ← X - Scale);
    (id = 6) → (Y ← Y - Scale);
    (id = 7) → (Y ← Y - Scale; X ← X + Scale))
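
The increment-mode sub-process above amounts to a table lookup on the 3-bit direction followed by one to three unit moves. A minimal Python sketch is given here. The function names, the tuple return value, and passing the step size as a plain `scale` argument are assumptions made for illustration.

```python
# Illustrative sketch of increment mode; names and return shape are hypothetical.
DIRECTIONS = {
    0: (1, 0), 1: (1, 1), 2: (0, 1), 3: (-1, 1),
    4: (-1, 0), 5: (-1, -1), 6: (0, -1), 7: (1, -1),
}


def move_1_position(x, y, direction, scale):
    """Move the beam one step of size `scale` in one of 8 directions."""
    dx, dy = DIRECTIONS[direction]
    return x + dx * scale, y + dy * scale


def plot_increment(x, y, increment, scale):
    """Decode one 6-bit increment; bit 0 is the most significant bit."""
    intensify = bool((increment >> 5) & 1)  # increment<0>
    count = (increment >> 3) & 0b11         # increment<1:2>
    direction = increment & 0b111           # increment<3:5>
    escape = count == 0                     # count 0 moves once, then escapes
    steps = 1 if escape else count
    for _ in range(steps):
        x, y = move_1_position(x, y, direction, scale)
    return x, y, intensify, escape
```

Each 12-bit instruction word would be split into two such 6-bit increments and passed through `plot_increment` in turn.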

character<0:11> := i<0:11>                          character instruction format
6_bit[0:1]<0:5> := character<0:11>
7_bit<5:11> := character<5:11>

(Data_Mode = 101) ∧ Data_State → (                  character instruction execution; see text
    (CHSZ = 0) → (X,Y ← f(M[SAR □ 6_bit[0]], M));   plot function
    (CHSZ = 1) → (X,Y ← f(M[SAR □ 7_bit], M)));

graph_plot<0:11> := i[0]<0:11>                      graph data instruction format
graph_plot_escape<0> := graph_plot<0>
graph_plot_x_y<0> := graph_plot<1>
graph_plot_data<0:9> := graph_plot<2:11>

(Data_Mode = 110) ∧ Data_State → (                  graph data execution
    ¬ graph_plot_x_y → (X ← X + Scale'; Y ← graph_plot_data; Beam ← 1);
    graph_plot_x_y → (Y ← Y + Scale'; X ← graph_plot_data; Beam ← 1);
    graph_plot_escape → (Data_State ← 0))
)                                                   end Instruction_execution

Section  2 


Processors  for  array  data 

Two array processors are discussed in this section. Conceptually,
they are an outgrowth of both the parallel, distributed
computer [Holland, 1959], and the matrix interpreter-based
programs for general-purpose computers. NOVA is a very low
cost special processor. ILLIAC IV is a very general array processor.
Another approach, the ILLIAC III [McCormick, 1963], stores
information on photographic media, so that optical processing
(inherently parallel) can be used.

NOVA 

NOVA  is  a  proposed,  non-general-purpose  machine  based  on 
the  belief  that  efficient,  special-function  processors  can  be  built 
to  solve  particular  problems. 

It is reasonable to assume that there are problems for which
NOVA, with its cyclic memory, would perform no worse than
a processor with a random-access memory. Unless the operations
performed on the arrays were extremely simple or restricted,
a single system might not always work very efficiently.
By using a variable-speed cyclic memory to match the operation
time in the form of an address transformation or renaming
mechanism, the access problems might be avoided.

NOVA represents a particular idea for effective utilization of
hardware and is presented to remind us that a memory now
considered obsolete may perform nicely for a restricted application.

The  ILLIAC  IV  computer 

D. L. Slotnick is responsible for the ILLIAC IV computer. The
idea for a computer with a number of parallel data operators
or processing elements appeared some time ago in the SOLOMON
computer [Gregory and McReynolds, 1963]. The technology
of the first and second generation made SOLOMON
impractical to build. ILLIAC IV was designed at the University
of Illinois under a contract to the Department of Defense's
Advanced Research Projects Agency.¹ The processing elements
are constructed from third-generation technology although
some medium- and large-scale integrated circuits are used in
the design.

The  design  is  about  the  most  ambitious  ever  undertaken. 
The  direct  and  indirect  effects  should  be  numerous. 

¹The University of Illinois monitored the contract to the Burroughs Corporation,
Paoli, Pa.




Chapter  26 


NOVA: a list-oriented computer¹

Joseph  E.  Wirsching 

Since the advent of the internally-stored program computer, those
of us concerned with problems involving massive amounts of computation
have taken a one-operation, one-operand approach. But
there is a very large class of problems involving massive amounts
of computation that may be thought of as one-operation, many-operand
in nature. Some familiar examples are numerical integration,
matrix operations, and payroll computation.

This  article  proposes  a  computer,  called  NOVA,  designed  to 
take  advantage  of  the  one-operation,  many-operand  concept. 
NOVA  would  use  rotating  memory  instead  of  high-cost  random 
access  memory,  reduce  the  number  of  program  steps,  and  reduce 
the  number  of  memory  accesses  to  program  steps.  In  addition  it 
is  shown  that  NOVA  could  execute  typical  problems  of  the  one- 
operation,  many-operand  type  in  times  comparable  to  that  of 
modern  high-speed  random  access  computers. 

Rotating  memories  were  used  in  early  computers  because  of 
low  cost,  reliability,  and  ease  of  fabrication.  These  machines  have 
been  replaced  by  machines  with  more  costly  random  access 
memories  primarily  to  increase  computing  speed  as  the  result  of 
a  decrease  in  access  time  to  both  operands  and  instructions. 

The  NOVA  approach 

Let  us  take  two  simple  examples  and  use  them  to  compare  con- 
ventional computing  techniques  with  those  proposed  for  NOVA. 

Example 1. Consider two lists (a's and b's) of which the corresponding
pairs are to be added. With a conventional computer
this is done with a program that adds the first a to the first b,
the second a to the second b, etc., and counts the operations. The
working part of such a program might consist of the following
instructions:

Fetch a
Add b
Store (a + b)
Count, Branch, and Index

¹Datamation, vol. 12, no. 12, pp. 41-43, December, 1966.


In  general,  the  four  or  more  instructions  must  be  brought  from 
the  memory  to  the  instruction  register  once  for  each  pair  in  the 
lists.  This  seems  to  be  a  great  waste  when  only  one  arithmetic 
operation  is  involved.  Indeed  it  is,  when  one  considers  that  the 
majority  of  computing  work  consists  of  the  performance  of  highly 
repetitive  operations  that  are  merely  combinations  of  the  simple 
example  given.  Attempts  have  been  made  to  alleviate  this  waste 
by incorporating "instruction stacks" and "repeat" commands into
the  instruction  execution  units  of  more  recent  computers. 
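
The cost described in Example 1 can be made concrete by counting instruction fetches. The Python sketch below contrasts a per-pair loop, which fetches four instructions for every pair, with a single list-level operation. The function names and the fetch-counting model are illustrative assumptions and do not model any particular machine.

```python
# Illustrative model only: counts instruction fetches, not real machine timing.
def conventional_add(a, b):
    """Add two lists pair by pair, fetching four instructions per pair."""
    fetches = 0
    results = []
    for i in range(len(a)):
        accumulator = a[i]           # Fetch a
        fetches += 1
        accumulator += b[i]          # Add b
        fetches += 1
        results.append(accumulator)  # Store (a + b)
        fetches += 1
        fetches += 1                 # Count, Branch, and Index
    return results, fetches


def list_add(a, b):
    """Add two lists with one list-level instruction, fetched once."""
    return [x + y for x, y in zip(a, b)], 1
```

For lists of n pairs the first form fetches 4n instructions and the second fetches one, while both produce the same sums.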

Example 2. Consider three lists (a's, b's and c's), where we wish
to compute (a + b) × c for each trio. There are two distinct
methods by which this can be accomplished: first, by forming
(a + b) × c for each trio of numbers in the list, or second, by
forming a new list consisting of (a + b) for each a and b, and then
multiplying each c by the corresponding member of the new list.
Clearly the second method is wasteful of memory space and
wasteful of programming steps.
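
The two methods of Example 2 can be sketched as follows. This is a minimal Python illustration with hypothetical function names. The second function makes the extra intermediate list explicit.

```python
# Illustrative sketch; function names are hypothetical.
def per_trio(a, b, c):
    """First method: form (a + b) * c directly for each trio."""
    return [(x + y) * z for x, y, z in zip(a, b, c)]


def via_intermediate_list(a, b, c):
    """Second method: build the list of sums, then multiply list by list."""
    sums = [x + y for x, y in zip(a, b)]  # extra list held in memory
    return [s * z for s, z in zip(sums, c)]
```

Both give the same results. The second needs storage for one additional list but consists purely of whole-list operations.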

Next,  let  us  take  a  look  at  the  memory  requirements  for  these 
two examples. First, the instructions are kept in a high-speed
random  access  memory,  and  while  the  bulk  of  the  variables  need 
not  be  kept  in  a  random  access  memory,  they  must  be  brought 
to  one  before  the  algorithm  can  be  performed.  This  extra  transfer 
may  entail  more  instructions  to  perform  the  logistics.  Thus  the 
simplicity  of  the  overall  program  is  directly  related  to  the  size 
of  the  memory.  The  variables  (a's,  b's,  etc.)  are  usually  stored  in 
consecutive  memory  locations.  Except  for  indexing  this  ordering 
of  the  data  is  not  exploited. 

In  NOVA,  lists  of  variables  are  kept  on  tracks  of  a  rotating  bulk 
memory.  When  called  for,  the  lists  of  variables  are  streamed 
through  an  arithmetic  unit  and  the  results  immediately  replaced 
on another track for future use. This process takes maximum advantage
of the sequential ordering of the variables. Instructions
need only be brought to the instruction execution unit once for
each pair of lists rather than once for each operand; thus the
instructions need not be stored in a random access memory but
may also be stored on the rotating bulk memory. This departure
from the requirement for random access memory significantly
reduces the cost of the computer, without sacrificing speed of
problem solution.

Solution  of  a  network  problem 

Before  going  further  into  the  structure  of  NOVA,  let  us  consider 
a  significant  example,  which  shows  that  NOVA  is  well  suited  to 
the  solution  of  differential  equations  using  difference  methods  over 
a  rectangular  network. 

Let Fig. 1 represent an artificial network used as a model for
some physical process. Generally speaking, the method of advancing
the variables at a mesh point (j, k) from one time step to the
next involves only information from the neighboring mesh points.
A typical hydrodynamics problem will require a list of 10 to 20
variables (physical quantities) at each mesh point. The traditional
computer solution involves listing these variables for each point
in a contiguous fashion and in a regular sequence with respect
to the rows and columns of the array. If the total array does not
fit into the fast memory, three adjacent columns (or rows) are
brought to the fast memory; as a new column is calculated, the
next column in sequence is brought in from bulk memory and the
oldest of the three is written to bulk memory. In this fashion one
proceeds across the array. This process is then repeated until some
significant physical occurrence happens and the problem is ended.

In NOVA, the variables are organized into separate lists rather
than by mesh point. From a computational standpoint this is
possible since the main memory of NOVA may be essentially
unlimited in size, at least exceeding the size of the largest present
network problems. One then proceeds to execute operations on
lists of variables rather than single variables, performing a single
operation for all mesh points in the array in sequence.

Fig. 1. Two-dimensional array. [Rectangular mesh; axes labeled J and K.]

Fig. 2. Lists of variables. [Columns: the original U and V lists; V shifted down by 1; V shifted down by 2; V shifted down by K.]

Let us look more closely at the variables and their possible
combinations. Let U_{j,k} and V_{j,k} be variables associated with the
array of Fig. 1. These variables are listed sequentially by column
in Fig. 2, along with further lists of the V column shifted by various
increments.

With some concentration, one discovers in Fig. 2 that an arithmetic
operation between U_{j,k} and V_{j,k} is simply a matter of taking
the two columns as they exist and operating on them in pairs. To
combine U_{j,k} with a nearby neighbor, V_{j,k-1}, the V column is
shifted down one place, at which time the proper neighboring
variables are found opposite one another for the entire network.
At certain boundaries of the array some elements have no proper
neighbors. In NOVA these boundary elements must be handled
separately in the same way as they must be handled separately
in a conventional machine. In NOVA, calculations at boundaries
may be temporarily inhibited by having a third input to the arithmetic
unit which allows the calculation of a result for a pair of
operands to proceed or not, as appropriate. This third input is
defined as "conditions," and is brought as a bit string to the arithmetic
unit concurrently with the operands. This bit string may
contain any number from one to several bits for each pair of
operands.
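
The role of the "conditions" input can be sketched in Python as a per-pair enable bit. The function name is hypothetical. Keeping the previous value wherever the bit is zero is an assumption, since the text says only that the calculation proceeds or not.

```python
# Illustrative sketch; what happens to an inhibited position is an assumption.
def conditional_list_op(op, a, b, conditions, previous):
    """Apply op to each operand pair whose condition bit is 1.

    Where the bit is 0 the calculation is inhibited and the previous
    value is kept (an assumed policy, not specified in the text).
    """
    return [
        op(x, y) if bit else old
        for x, y, bit, old in zip(a, b, conditions, previous)
    ]
```

For example, a condition string with zeros at the boundary rows would leave those mesh points untouched while every interior point is updated in the same pass.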



Further observation shows not only that it is possible to obtain
the nearest neighbors easily by shifting the columns of variables
with respect to one another, but that any neighbor relationship
can be obtained. In general, for an operation with a neighbor ±n
rows away and ±m columns away, the lists are offset by
±n ± m·K, where K is the number of rows in the array.
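
The offset rule can be checked with a small Python sketch, assuming the mesh is listed column by column with K entries per column. The function names and the use of `None` for positions with no partner are assumptions made for illustration.

```python
# Illustrative sketch; names and the None fill value are hypothetical.
def flatten_by_column(grid):
    """List grid[j][k] column by column; each column holds K rows."""
    return [value for column in grid for value in column]


def shift_down(values, offset, fill=None):
    """Shift a list down so that position i holds values[i - offset]."""
    size = len(values)
    return [
        values[i - offset] if 0 <= i - offset < size else fill
        for i in range(size)
    ]


def neighbor_offset(n, m, rows_per_column):
    """Offset that pairs each point with the one n rows and m columns before it."""
    return n + m * rows_per_column
```

With K = 4, shifting by `neighbor_offset(1, 1, 4)` = 5 places the value from column j-1, row k-1 opposite the entry for (j, k). Near column ends the shifted partner belongs to the adjacent column, which is exactly the boundary case that must be handled separately.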

Many problems (for example, payroll and inventory records)
are essentially list-structured but do not require offsetting of variables.
Clearly the NOVA structure is well suited for the solutions
of these problems also.

Structure 

The  most  difficult  problem  to  be  solved  in  the  proposed  computer 
is  to  synchronize  movement  of  the  columns  of  data  that  require 
offset.  Buffers  of  various  types  could  be  used  to  solve  this  problem; 
they  could  range  all  the  way  from  rotating  memory  devices  or 
delay  lines  to  core  memories.  The  former  are  simple,  direct,  and 
low  in  cost  but  are  limited  in  their  general  capabilities.  On  the 
other  hand,  a  number  of  small  random  access  buffer  memories 
could  be  used  for  offsetting  lists  of  variables  and  for  facilitating 
special  functions  such  as  boundary  calculations  but  at  a  higher 
equipment  cost. 

Figure 3 shows a block diagram of the organization of NOVA.
The rotating memory, which might be a disc or drum, would be
composed of several hundred tracks, each storing several thousand
words, with a total capacity between one and two million words.
Each track would have an individual read-write head. The heads
would be organized in such a way as to attain a high word-transfer
rate, perhaps as high as one million words per second. With this
in mind an ideal execution time for one addition would be the
time required to move two operands from the disc to the arithmetic
unit; i.e., 1-2 microseconds. The disc synchronizer would
be capable of simultaneously reading two lists of operands, writing
one list of results, and reading one list and writing one list of
conditional control information. In addition, instructions would
be read from another channel in small blocks.

Fig. 3. Block diagram of NOVA computer.

Fig. 4. Buffering in arithmetic unit.
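
The 1-2 microsecond figure follows directly from the quoted transfer rate. The short Python sketch below works it through, assuming one microsecond per pair when both operand lists are read simultaneously and two when they are read one after the other. The function name and its parameters are hypothetical.

```python
# Illustrative arithmetic only; the two read policies are assumptions.
def list_add_time_us(n_pairs, words_per_second=1_000_000, simultaneous_reads=True):
    """Ideal time in microseconds to stream n_pairs operand pairs."""
    word_time_us = 1e6 / words_per_second
    per_pair_us = word_time_us if simultaneous_reads else 2 * word_time_us
    return n_pairs * per_pair_us
```

At one million words per second, a list of 5,000 pairs would take about 5 ms with simultaneous reads and about 10 ms otherwise, before any rotational delay is counted.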

The bit string of conditions coming from the memory is used
to control individual operations on pairs of operands in the lists,
and in essence each bit (or bits) is a subordinate part of the individual
operations. Conditions going to the memory are the subsidiary
result of the operation of one list upon another. These bit
strings may be used later as control during another list operation.
They may also contain information on the occurrence of an
overflow or underflow, or on the presence of an illegal operand,
etc.

Figure 4 shows a suggested organization for the arithmetic unit
that incorporates five sets of alternating buffers. Two sets are for
lists of operands coming from the memory, one set for lists of
results going to the memory, and two sets for "conditions" (conditional
control information) coming from and going to the memory.
These buffers should be equivalent in length to the number of
words on a track of the rotating memory.

The loading and unloading of the buffers to and from the rotating
memory is dependent on the timing of the rotating memory,
whereas the loading and unloading of the buffers to and from the
arithmetic unit is guided solely by the rate at which the arithmetic
can be performed. Here again it may also be possible to take
advantage of the streaming nature of the operands by designing
an "assembly-line" arithmetic unit in which more than one pair
of operands could be in process at the same time. With this kind
of unit it may be possible to execute additions at a rate equal to
the word-transfer rate from the rotating memory; however, a
multiplication or division of two lists may require several revolutions
of the memory. The timing diagram of Fig. 5 shows several
typical instructions being carried out. A certain amount of look-ahead
is required, but there is ample time for this, since instructions
are prepared for execution at an average rate of less than
one per revolution of the rotating memory.

While a detailed cost estimate has not been made for a simple
prototype NOVA, a quick estimate would be $50,000 for a head-per-track
disc and $50,000 for the arithmetic and control section,
making a total of $100,000. For a buffering scheme such as the
one shown in Fig. 4 the cost would be considerably higher but
would be offset by increased versatility.

Conclusions 

In the previous paragraphs we have demonstrated that NOVA is
capable of handling network problems at a significantly lower cost
than contemporary computers, and at a comparable speed. The
availability of such a machine as NOVA would stimulate further
interest in the one-operation, many-operand approach to computation
and no doubt would uncover many other problems to which
it could be applied.

Fig. 5. Timing diagram of buffers, rotating memory, and arithmetic unit.
Dotted line shows movement of data into a device; solid line shows
movement out. [Rows: buffers A1, A2, B1, B2, C1 (E1), C2 (E2), D1, D2,
and the arithmetic unit; horizontal axis: revolutions of rotating memory.]

Because NOVA makes it possible to easily establish neighbor
relationships between mesh points that are further away than
nearest neighbors, it may be possible to develop new differencing
techniques for the solution of coupled sets of differential equations.
This may increase the accuracy or shorten the time required for
their solution.

The memory, arithmetic, and other units needed for NOVA are
commercially available now. No new technology would be required
to fabricate a prototype model. In view of the potential advantages
of such a machine, it seems clear that construction of a model
would justify the minimal development costs.


Chapter  27 

The ILLIAC IV computer¹


George H. Barnes / Richard M. Brown / Maso Kato
David J. Kuck / Daniel L. Slotnick / Richard A. Stokes

Summary The structure of ILLIAC IV, a parallel-array computer containing
256 processing elements, is described. Special features include
multiarray processing, multiprecision arithmetic, and fast data-routing
interconnections. Individual processing elements execute 4 × 10^6 instructions
per second to yield an effective rate of 10^9 operations per second.

Index terms Array, computer structure, look-ahead, machine language,
parallel processing, speed, thin-film memory.
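
As a rough consistency check on the rates quoted in the Summary, assuming all 256 processing elements are busy on every instruction (an idealization rather than a claim made by the authors):

```latex
256 \times \left(4 \times 10^{6}\ \text{instructions/s}\right)
  = 1.024 \times 10^{9}\ \text{operations/s} \approx 10^{9}\ \text{operations/s}
```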

Introduction 

The  study  of  a  number  of  well-formulated  but  computationally 
massive  problems  is  limited  by  the  computing  power  of  currently 
available  or  proposed  computers.  Some  involve  manipulations  of 
very  large  matrices  (e.g.,  linear  programming);  others,  the  solution 
of  sets  of  partial  differential  equations  over  sizable  grids  (e.g., 
weather  models);  and  others  require  extremely  fast  data  correlation 
techniques  (phased  array  signal  processing).  Substantive  progress 
in these areas requires computing speeds several orders of magnitude
greater than conventional computers.

At  the  same  time,  signal  propagation  speeds  represent  a  serious 
barrier  to  increasing  the  speed  of  strictly  sequential  computers. 
Thus,  in  recent  years  a  variety  of  techniques  have  been  introduced 
to  overlap  the  functions  required  in  sequential  processing,  e.g., 
multiphased memories, program look-ahead, and pipeline arithmetic
units. Incremental speed gains have been achieved but at
considerable  cost  in  hardware  and  complexity  with  accompanying 
problems  in  machine  checkout  and  reliability. 

The use of explicit parallelism of operation rather than over-
lapping of subfunctions offers the possibility of speeds which in-
crease linearly with the number of gates, and consequently has
been  explored  in  several  designs  [Slotnick  et  al.,  1962;  Unger,  1958; 
Holland,  1959;  Murtha,  1966].  The  SOLOMON  computer  [Slotnick 
et  al.,  1962],  which  introduced  a  large  degree  of  overt  parallelism 
into  its  structure,  had  four  principal  features. 

1 A large array of arithmetic units was controlled by a single
control unit so that a single instruction stream sequenced
the processing of many data streams.

¹IEEE Trans., C-17, vol. 8, pp. 746-757, August, 1968.

2  Memory  addresses  and  data  common  to  all  of  the  data 
processing  were  broadcast  from  the  central  control. 

3  Some  amount  of  local  control  at  the  individual  processing 
element  level  was  obtained  by  permitting  each  element  to 
enable  or  disable  the  execution  of  the  common  instructions 
according  to  local  tests. 

4  Processing  elements  in  the  array  had  nearest-neighbor  con- 
nections to  provide  moderate  coupling  for  data  exchange. 

Studies  with  the  original  SOLOMON  computer  indicated  that 
such  a  parallel  approach  was  both  feasible  and  applicable  to  a 
variety  of  important  computational  areas.  The  advent  of  LSI  cir- 
cuitry, or  at  least  medium-scale  versions,  with  gate  times  of  the 
order  of  2  to  5  ns,  suggested  that  a  SOLOMON-type  array  of 
potentially 10^9 word operations per second could be realized. In
addition, memory technology had advanced sufficiently to indicate
that 10^6 words of memory with 200 to 500-ns cycle times could
be  produced  at  acceptable  cost.  The  ILLIAC  IV  Phase  I  design 
study  during  the  latter  part  of  1966  resulted  in  the  design  discussed 
in  this  paper.  The  machine,  to  be  fabricated  by  the  Defense  Space 
and  Special  Systems  Division  of  Burroughs  Corporation,  Paoli,  Pa., 
is  scheduled  for  installation  in  early  1970. 

Summary  of  the  ILLIAC  IV 

The  ILLIAC  IV  main  structure  consists  of  256  processing  elements 
arranged in four reconfigurable SOLOMON-type arrays of 64
processors  each.  The  individual  processors  have  a  240-ns  ADD 
time  and  a  400-ns  MULTIPLY  time  for  64-bit  operands.  Each 
processor requires approximately 10^4 ECL gates and is provided
with  2048  words  of  240-ns  cycle  time  thin-film  memory. 

Instruction  and  addressing  control 

The ILLIAC IV array possesses a common control unit which
decodes the instructions and generates control signals for all
processing elements in the array. This eliminates the cost and
complexity of decoding and timing circuits in each element.

In addition, an index register and address adder are provided
with each processing element, so that the final operand address
a_i for element i is determined as follows:

a_i = a + (b) + (c_i)

where a is the base address specified in the instruction, (b) is the
contents of a central index register in the control unit, and (c_i)
is the contents of the local index register of the processing ele-
ment i. This independence in operand addressing is very effective
for handling rows and columns of matrices and other multidimen-
sional data structures [Kuck, 1968].
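
The address rule can be illustrated with a minimal Python sketch. The function name, the example values, and the wrap-around to a 2048-word memory are assumptions made for illustration; the text specifies only the three-term sum.

```python
def operand_addresses(base, cu_index, pe_indices, memory_words=2048):
    """Return a_i = a + (b) + (c_i) for each processing element i.

    base       -- a, the base address carried in the instruction
    cu_index   -- (b), contents of the central index register in the CU
    pe_indices -- (c_i), contents of each PE's local index register
    The modulo wrap to the PE memory size is an assumption of this sketch.
    """
    return [(base + cu_index + c_i) % memory_words for c_i in pe_indices]


# One broadcast instruction, four PEs: every PE offsets the common
# address a + (b) by its own local index, so each fetches a different word.
addresses = operand_addresses(base=100, cu_index=10, pe_indices=[0, 1, 2, 3])
print(addresses)
```

Because only (c_i) differs between elements, loading the local index registers with a stride lets a single instruction pick up, for example, one element from each row of a matrix stored across the PE memories.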

Mode  control  and  data  conditional  operations 

Although the goal of the ILLIAC IV structure is to be able to
control the processing of a number of data streams with a single
instruction stream, it is sometimes necessary to exclude some data
streams or to process them differently. This is accomplished by
providing each processor with an ENABLE flip-flop whose value
controls  the  instruction  execution  at  the  processor  level. 

The  ENABLE  bit  is  part  of  a  test  result  register  in  each 
processor  which  holds  the  results  of  tests  conditional  on  local  data. 
Thus  in  ILLIAC  IV  the  data  conditional  jumps  of  conventional 
computers  are  accomplished  by  processor  tests  which  enable  or 
disable local execution of subsequent commands in the instruction
stream. 
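
The replacement of data-conditional jumps by local enabling can be sketched in Python as follows. The function names and the list-per-array representation are hypothetical; the sketch only mirrors the idea that a local test sets each ENABLE bit and that disabled processors ignore the common instruction.

```python
def local_test(values, predicate):
    """Set each PE's ENABLE bit from a test on its own local data."""
    return [bool(predicate(v)) for v in values]


def broadcast_add(accumulators, operand, enable):
    """Apply one common ADD; disabled PEs keep their old accumulator."""
    return [a + operand if e else a for a, e in zip(accumulators, enable)]


# Equivalent of "if x < 0: x = x + 10", applied to every data stream
# with a single instruction stream and no jump.
acc = [3, -1, 4, -5]
enable = local_test(acc, lambda x: x < 0)
acc = broadcast_add(acc, 10, enable)
print(acc)
```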

Routing 

Each processing element i in the ILLIAC IV has data routing
connections to 4 of its neighbors, processors i + 1, i - 1, i + 8,
and i - 8. End connection is end around so that, for a single array,
processor  63  connects  to  processors  0,  62,  7,  and  55. 

Interprocessor  data  transmissions  of  arbitrary  distance  are  ac- 
complished by  a  sequence  of  routings  within  a  single  instruction. 
For  a  64-processor  array  the  maximum  number  of  routing  steps 
required is 7; the average over all possible distances is 4. In actual
programs, routing by distance 1 is most common and distances
greater  than  2  are  rare. 
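
The figures quoted for a 64-processor array can be checked with a short breadth-first search over the end-around connections. This is a sketch for verification only; the function names are illustrative and nothing here reflects how the hardware sequences its routings.

```python
from collections import deque

STRIDES = (1, -1, 8, -8)


def neighbors(pe, n=64):
    """Processors directly connected to `pe` with end-around wrap."""
    return [(pe + s) % n for s in STRIDES]


def routing_steps(n=64):
    """Minimum number of routing steps from PE 0 to every PE."""
    dist = {0: 0}
    queue = deque([0])
    while queue:
        pe = queue.popleft()
        for nxt in neighbors(pe, n):
            if nxt not in dist:
                dist[nxt] = dist[pe] + 1
                queue.append(nxt)
    return dist


steps = routing_steps()
worst = max(steps.values())
average = sum(steps.values()) / (len(steps) - 1)  # over the 63 nonzero distances
print(neighbors(63), worst, average)
```

Since the connection pattern is the same for every processor, distances measured from processor 0 hold for any starting processor.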

Common  operand  broadcasting 

Constants or other operands used in common by all the processors
are fetched and stored locally by the central control and broadcast
to the processors in conjunction with the instruction using them.
This has two advantages: (1) it reduces the memory used for
storage of program constants, and (2) it permits overlap of common
operand fetches with other operations.

Processor  partitioning 

Many  computations  do  not  require  the  full  64-bit  precision  of  the 
processors.  To  make  more  efficient  use  of  the  hardware  and  speed 
up  computations,  each  processor  may  be  partitioned  into  either 
two 32-bit or eight 8-bit subprocessors, to yield 512 32-bit or
2048  8-bit  subprocessors  for  the  entire  ILLIAC  IV  set. 

The subprocessors are not completely independent in that they
share a common index register and the 64-bit data routing paths.
The 32-bit subprocessors have separate enabled/disabled modes
for  indexing  and  data  routing;  the  8-bit  subprocessors  do  not. 

Array  partitioning 

The 256 elements of ILLIAC IV are grouped into four separate
subarrays of 64 processors, each subarray having its own control
unit and capable of independent processing. The subarrays may
be dynamically united to form two arrays of 128 processors or one
array  of  256  processors.  The  following  advantages  are  obtained. 

1 Programs with moderately dimensioned vector or matrix
variables can be more efficiently matched to the array size.

2 Failure of any subarray does not preclude continued proc-
essing by the others.

This paper summarizes the structure of the entire ILLIAC IV
system. Programming techniques and data structures for ILLIAC
IV are covered in a paper by Kuck [1968].

ILLIAC  IV  structure 

The organization of the ILLIAC IV system is indicated in Fig. 1.
The individual processing elements (PEs) are grouped in four
arrays, each containing 64 elements and a control unit (CU). The
four arrays may be connected together under program control to
permit multiprocessing or single-processing operation. The system
program resides in a general-purpose computer, a Burroughs
B 6500, which supervises program loading, array configuration
changes, and I/O operations internal to the ILLIAC IV system
and to the external world. To provide backup memory for the
ILLIAC IV arrays, a large parallel-access disk system (10^9 bits, 10^9
bit per second access rate, 40-ms maximum latency) is directly
coupled to the arrays. There is also provision for real-time data
connections directly to the ILLIAC IV arrays.


322  Part  4     The  instruction-set  processor  level:  special-function  processors 


Section  2  |  Processors  for  array  data 


[Figure 1 labels: four arrays of 64 PEs each; I/O switch; real-time link; to peripherals and computer net.]


Fig.  1.  ILLIAC  IV  system  organization. 


Array  organization 

The internal structure of an array is indicated in Fig. 2. The 64
processing  elements  in  each  array  are  arranged  in  a  string  and 
are  controlled  by  the  control  unit  (CU)  which  receives  the  instruc- 
tion string,  generates  the  appropriate  control  signals  and  address 
parameters  of  the  instructions,  and  transmits  them  to  the  indi- 
vidual processing  elements  for  execution.  In  addition,  each  CU 
can  broadcast  via  the  common  data  bus  operands  for  common  use 
(e.g., constants).

Full  word  length  (64  bits)  communication  exists  between  the 
processing  elements  for  exchange  of  information  by  organized  rout- 
ing of  words  along  the  string  array.  Direct  routing  connections 
exist  for  nearest  neighbors  and  also  for  processing  elements  8  units 
away. Routing for intermediate distances is generated via
sequences of routes of +1, -1, +8, or -8. The end connections
of  the  string  are  circular,  but  can  be  broken  and  connected  to 
the  ends  of  other  arrays  when  the  system  is  organized  in  one  of 
the  multiarray  configurations. 


All  processing  elements  of  an  array  execute,  of  course,  the  same 
instruction  in  unison  under  the  control  of  the  CU;  local  control 
is  provided  by  the  mode  bit  in  each  processing  element  which 
enables  or  disables  the  execution  of  the  current  instruction.  The 
control  unit  is  able  to  sense  the  mode  bits  of  all  processing  ele- 
ments under  its  control  and  thereby  monitor  the  state  of  operation. 

Multiarray  configurations 

To  permit  more  optimal  matching  of  array  size  to  problem  struc- 
ture, the  four  arrays  may  be  united  in  three  different  configura- 
tions, as  shown  in  Fig.  3.  To  enlarge  the  arrays,  the  end  connections 
of  the  PE  strings  are  decoupled  and  attached  to  the  ends  of  the 
other  arrays  to  form  strings  of  128  or  256  processors.  For  multiarray 
configurations  all  CUs  receive  the  same  instruction  string  and  any 
data  centrally  accessed.  The  control  units  execute  the  instructions 
independently,  however,  with  inter-CU  synchronization  occurring 
only on those instructions in which data or control information
must  cross  array  boundaries.  This  simplifies  and  speeds  up  the  in- 
struction execution  in  multiarray  configurations.  The  multiplicity 
of array configurations introduces complexities in memory ad-
dressing which will be discussed in a later section.

Control  unit 

The array control unit (CU) has the following five functions.

1  To  control  and  decode  the  instruction  streams 

2  To  generate  the  control  pulses  transmitted  to  the  processing 
elements  for  instruction  execution 

3  To  generate  and  broadcast  those  components  of  memory 
addresses  which  are  common  to  all  processors 

4  To  manipulate  and  broadcast  data  words  common  to  the 
calculations  of  all  the  processors 


5 To receive and process trap signals arising from arithmetic
faults in the processors, from internal I/O operations, and
from the B 6500.

[Figure 2 labels: routing network; control unit bus (instruction and common operands).]

Fig. 2. Array structure.

[Figure 3 panels: four quadrant arrays; two quadrant arrays; single quadrant arrays.]

Fig. 3. Multiarray configurations.

The structure of the control unit is shown in Fig. 4. Principal
components of the CU are two fast-access buffers of 64 words each,
one associatively addressed, which holds current and pending
instructions (PLA), and the other a local data buffer (LDB). The
four 64-bit accumulator registers (CAR) are central to communi-
cation within the CU and hold address indexing information and
active data for logical manipulation or broadcasting. The CU
arithmetic unit (CULOG) performs addition, subtraction, and
Boolean operations; more complex data manipulations are rele-
gated to the PEs. To specify and control array configurations, there
are three 4-bit configuration control registers whose use will be
described in another section.

[Figure 4 labels: instruction buffer; final queue (FINQ); from PE memory; associative memory (CAM); local data buffer (LDB); memory access control (MSU).]

Fig. 4. Control-unit block diagram.

Instruction processing

All instructions are 32 bits in length and belong to one of two
classes: CU instructions, which generate operations local to the
CU (e.g., indexing, jumps, etc.), and PE instructions, which are
decoded in the CU and then transmitted via control pulses to all
the processing elements. Instructions flow from the array memory
upon demand in blocks of 8 words (16 instructions) into the in-
struction buffer. As the control advances, individual instructions
are extracted from the instruction buffer and sent to the advanced
instruction station (ADVAST) which decodes them and executes
those instructions local to the CU. In the case of PE instructions,
ADVAST constructs the necessary address or data operands and
stacks the result in a queue (FINQ) to await transmission to the
PEs. PE instructions are taken from the bottom of the stack to
the final instruction station (FINST) which controls the broadcast
of address or data and holds the PE instruction during the execu-
tion period.

The use of the PE instruction queue permits overlap between
the CU and PE instruction executions; the amount of overlap
depends, of course, on the distribution of CU and PE instructions.
As in all overlap strategies, careful attention to the instruction
sequence by the programmer or compiler can result in consider-
able speedup of program execution.

The instruction buffer holds a maximum of 128 instructions,
sufficient to hold the inner loop of many programs. For such loops,
after initial loading, instructions are fetched from the buffer with
minimal delay.

A variety of strategies for instruction buffer loading were ex-
amined, and the following straightforward approach was taken.
When the instruction counter is halfway through a block of 8
words (16 instructions), fetch of the next block is initiated; the
possibility of pending jumps to different blocks is ignored. If the
next block is found to be already resident in the buffer, no further
action is taken; else fetch of the next block from the array memory
is initiated. On arrival of the requested block, the instruction
buffer is cyclically filled; the oldest block is assumed to be the
least required block in the buffer and is overwritten. Jump instruc-
tions initiate the same procedures.
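
The loading strategy can be sketched as a small Python model. The class and method names are hypothetical, timing is ignored, and the choice of instruction 8 of 16 as the "halfway" trigger is an assumption; the sketch shows only the residency check, the prefetch of the next sequential block, and the overwriting of the oldest block.

```python
from collections import deque

WORDS_PER_BLOCK = 8
INSTRUCTIONS_PER_BLOCK = 2 * WORDS_PER_BLOCK    # two 32-bit instructions per word
BUFFER_BLOCKS = 128 // INSTRUCTIONS_PER_BLOCK   # 128-instruction buffer -> 8 blocks


class InstructionBuffer:
    def __init__(self):
        # Oldest block sits at the left; a full deque drops it on append.
        self.blocks = deque(maxlen=BUFFER_BLOCKS)
        self.fetches = 0

    def _ensure_resident(self, block):
        if block not in self.blocks:
            self.blocks.append(block)   # overwrites the oldest block when full
            self.fetches += 1

    def step(self, pc):
        """Account for executing the instruction at instruction address pc."""
        block, offset = divmod(pc, INSTRUCTIONS_PER_BLOCK)
        self._ensure_resident(block)            # demand fetch, e.g. after a jump
        if offset == INSTRUCTIONS_PER_BLOCK // 2:
            self._ensure_resident(block + 1)    # prefetch, ignoring pending jumps


# A 40-instruction inner loop executed 10 times: after the first pass
# every block is resident and no further fetches occur.
buf = InstructionBuffer()
for _ in range(10):
    for pc in range(40):
        buf.step(pc)
print(buf.fetches)
```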

Fetch  of  a  new  instruction  block  from  memory  requires  a  delay 
of  approximately  three  memory  cycles  to  cover  the  signal  trans- 
mission times  between  the  array  memory  and  the  control  unit. 
On  execution  of  a  straight  line  program,  this  delay  is  overlapped 
with  the  execution  of  the  8  instructions  remaining  in  the  current 
block. 

In a multiple-array configuration, instructions are fetched from
the  array  memory  specified  by  the  program  counter,  and  broadcast 
simultaneously  to  all  the  participating  control  units.  Instruction 
processing  thereafter  is  identical  to  that  for  single-array  operation, 
except  that  synchronization  of  the  control  units  is  necessary 
whenever  information,  in  the  form  of  either  data  or  control  signals, 
must  cross  array  boundaries.  CU  synchronization  must  be  forced 
at  all  fetches  of  new  instruction  blocks,  upon  all  data  routing 
operations,  all  conditional  program  transfers,  and  all  configuration- 
changing  instructions.  With  these  exceptions,  the  CUs  of  the 
several  arrays  run  independently  of  one  another.  This  simplifies 
the  control  in  the  multiple-array  operation;  furthermore,  it  permits 
I/O  transactions  with  the  separate  array  memories  without  steal- 
ing memory  cycles  from  the  nonparticipating  memories. 

Memory  addressing 

Both  data  and  instructions  are  stored  in  the  combined  memories 
of  the  array.  However,  the  CU  has  access  to  the  entire  memory, 
while each PE can only directly reference its own 2,048-word PEM.
The  memory  appears  as  a  two-dimensional  array  with  CU  access 
sequential  along  rows  and  with  PE  access  down  its  own  column. 
In  multiarray  configurations  the  width  of  the  rows  is  increased 
by  multiples  of  64. 

The  resulting  variable-structure  addressing  problem  is  solved 
by  generating  a  fixed-form  20-bit  address  in  the  CU  as  shown  in 
Fig.  5.  The  lower  6  bits  identify  the  PE  column  within  a  given 
array.  The  next  2  bits  indicate  the  array  number,  and  the  remain- 
ing higher-order  bits  give  the  row  value.  The  row  address  bits 
actually  transmitted  to  the  PE  memories  are  configuration- 
dependent  and  are  gated  out  as  shown. 
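
The fixed-form address can be unpacked as in the following Python sketch. The function name is hypothetical, and the configuration-dependent gating of the row bits (shown only in Fig. 5) is deliberately not modeled; the sketch covers just the field layout stated in the text.

```python
def split_cu_address(addr):
    """Split a 20-bit CU address into (row, array, column).

    Layout taken from the text: the low 6 bits select the PE column,
    the next 2 bits select the array, and the remaining high-order
    bits give the row.
    """
    if not 0 <= addr < (1 << 20):
        raise ValueError("CU addresses are 20 bits wide")
    column = addr & 0x3F         # bits 0-5
    array = (addr >> 6) & 0x3    # bits 6-7
    row = addr >> 8              # bits 8-19
    return row, array, column


# Row 5, array 2, PE column 63.
print(split_cu_address((5 << 8) | (2 << 6) | 63))
```

Stepping the address by 1 moves along a row from one PE column to the next, which matches the description of CU access as sequential along rows.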

Addresses used by the PEs for local operands contain three
components: a fixed address contained in the instruction, a CU
index value added from one of the CU accumulators, and a local
PE index value added at the PE prior to transmission to its own
memory.

[Figure 5 labels: array (2); column; single array; array configurations; address bits (12) to PEs.]

Fig. 5. Memory address structure.

CU data operations

The control unit can fetch either individual words or blocks of
8 words from the array memory to the local data buffer. In addi-
tion, it can fetch 1 bit selected from the 8-bit mode register of
each processing element to form a 64-bit word read into the CU
accumulator. The CU program counter (PGR) and the configura-
tion registers are also directly addressable by the CU. Data
manipulations (+, -, Boolean) are performed on a selected CAR
and the result returned to the CAR. Data to be broadcast to the
processing elements is inserted into the FINQ along with the
accompanying instruction and transmitted to the PEs at the appro-
priate time.
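
The gathering of one mode bit from each processing element into a 64-bit word can be sketched as follows. The function name and the bit ordering (PE i supplies bit i of the result) are assumptions of this sketch; the text does not state the ordering.

```python
def gather_mode_bit(mode_registers, bit):
    """Form a 64-bit word from one selected bit of each PE's 8-bit RGM."""
    if len(mode_registers) != 64:
        raise ValueError("one mode register per PE in a 64-PE array")
    if not 0 <= bit < 8:
        raise ValueError("the mode register is 8 bits wide")
    word = 0
    for pe, rgm in enumerate(mode_registers):
        word |= ((rgm >> bit) & 1) << pe   # assumed ordering: PE i -> bit i
    return word


# Only PEs 3 and 63 have bit 2 of their mode register set.
registers = [0] * 64
registers[3] = 0b00000100
registers[63] = 0b00000100
word = gather_mode_bit(registers, 2)
print(hex(word))
```

A nonzero result tells the CU that at least one processor has the selected bit set, which is the kind of array-wide test a single instruction stream needs before deciding how to proceed.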

Configuration  control 

With  the  variety  of  array  configurations  for  ILLIAC  IV,  it  is 
necessary  to  specify  and  control  the  subarrays  which  are  conjoined 
and  to  designate  the  instruction  and  data  addressing.  For  this 
purpose  each  CU  has  three  configuration  control  registers  (CFC), 
each  of  4-bit  length,  where  each  bit  corresponds  to  one  of  the  four 
subarrays.  The  CFC  registers  may  be  set  by  the  B  6500  or  a  CU 
instruction.

CFC0 of each CU specifies the array configuration in which
it is participating by means of a 1 in the appropriate bits of CFC0.
CFC1 specifies the instruction addressing to be used within the
array. In a united configuration it is thus possible for the instruc-
tion stream to be derived from any subset of the united arrays.
CFC2 specifies the CU data addressing form in a manner similar
to the CFC1 control of instruction addressing.




The addressing indicated by both CFC1 and CFC2 must be
consistent with the actual configuration designated by CFC0, else
a  configuration  interrupt  is  triggered. 

Trap  processing 

Because  external  demands  on  the  arrays  will  be  preprocessed 
through  the  B  6500  system  computer,  the  interrupt  system  for  the 
control  units  is  relatively  straightforward.  Interrupts  are  provided 
to  handle  B  6500  control  signals  and  a  variety  of  CU  or  array  faults 
(undefined instructions, instruction parity error, improper con-
figuration control instruction, etc.). Arithmetic overflow and under-
flow in any of the processing elements is detected and produces a
trap.

The strategy of response to an interrupt is an effective FORK
to a single-array configuration. Each CU saves its own status word
automatically and independently of other CUs with which it may
previously have been configured.

Hardware implementation consists of a base interrupt address
register (BIAR) which is dedicated as a pointer to array storage
into which status information will be transferred. Upon receipt
of an interrupt, the contents of the program counter and other
status information and the contents of CAR 0 are stored in the
block pointed to by the BIAR. In addition, CAR 0 is set to contain
the block address used by BIAR so that subsequent register saving
may be programmed. Interrupt returns are accomplished through
a special instruction which reloads the previous status word and
CAR 0 and clears the interrupt.

Interrupts are enabled through a mask word in a special regis-
ter. The interrupt state is general and not unique to a specific
trigger  or  trap.  During  the  interrupt  processing,  no  subsequent 
interrupts  are  responded  to,  although  their  presence  is  flagged  in 
the  interrupt  state  word. 

The  high  degree  of  overlap  in  the  control  unit  precludes  an 
immediate  response  to  an  interrupt  during  the  instruction  which 
generates  an  arithmetic  fault  in  some  processing  element.  To 
alleviate  this  it  is  possible  under  program  control  to  force  non- 
overlapped  instruction  execution  permitting  access  to  definite  fault 
information. 

Processing  element  (PE) 

The  processing  element,  shown  in  Fig.  6,  executes  the  data  com- 
putations and  local  indexing  for  operand  fetches.  It  contains  the 
following  elements. 

1 Four 64-bit registers (A, B, R, S) to hold operands and results.
A serves as the accumulator, B as the operand register, R as
the multiplicand and data routing register, and S as a general
storage register.

2 An adder/multiplier (MSG, PAT, CPA), a logic unit (LOG),
and a barrel switch (BSW) for arithmetic, Boolean, and
shifting functions, respectively.

3 A 16-bit index register (RGX) and adder (ADA) for memory
address modification and control.

4 An 8-bit mode register (RGM) to hold the results of tests
and the PE ENABLE/DISABLE state information.

As described earlier, the PEs may be partitioned into subproc-
essors of word lengths of 64, 2 × 32, or 8 × 8 bits. Figure 7 shows
the data representations available. Exponents are biased and rela-
tive to base 2. Table 1 indicates the arithmetic and logical opera-
tions available for the three operand precisions.

PE  mode  control 

Two bits of the mode register (RGM) control the enabling or
disabling of all instructions; one of these is active only in the 32-bit
precision mode and controls instruction execution on the second
operand. Two other bits of RGM are set whenever an arithmetic
fault (overflow, underflow) occurs in the PE. The fault bits of all
PEs are continuously monitored by the CU to detect a fault condi-
tion and initiate a CU trap.

Data  paths 

Each PE has a 64-bit wide routing path to 4 of its neighbors (±1,
±8). To minimize the physical distances involved in such routing,
the PEs are grouped 8 to a cabinet (PUC) in the pattern shown
in Fig. 8. Routing by distance ±8 occurs interior to a PUC; routing
by distance ±1 requires no more than 2 intercabinet distances.
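
One cabinet assignment consistent with these statements places PE i in cabinet i mod 8. This assignment is an assumption of the sketch below, not a description of the actual pattern in Fig. 8, and the physical folding that bounds ±1 routing to 2 intercabinet distances is not modeled.

```python
N_PE = 64
PES_PER_CABINET = 8
STRIDES = (1, -1, 8, -8)


def cabinet(pe):
    """Cabinet (PUC) index under the assumed assignment pe mod 8."""
    return pe % PES_PER_CABINET


def route(pe, stride):
    """Return (source cabinet, destination cabinet) for one routing step."""
    return cabinet(pe), cabinet((pe + stride) % N_PE)


# Routing by +8 or -8 never leaves the cabinet; routing by +1 or -1
# always moves to a cabinet whose index differs by one (end around).
interior = all(src == dst for pe in range(N_PE)
               for src, dst in (route(pe, 8), route(pe, -8)))
print(interior, route(7, 1))
```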

CU data and instruction fetches require blocks of 8 words,
which are accessed in parallel, 1 word per PUC, into a CU buffer
(CUB) 512 bits wide, distributed among the PUCs, 1 word per
cabinet. Data is transmitted to the CU from the CUB on a 512-line
bus.

Table 1 PE data operations

Operation time per element

Operation    64 bit        2 × 32 bit    8 × 8 bit
+, -         200 ns        240 ns        80 ns
×            400 ns        400 ns
             2200 ns       3040 ns
Boolean      80 ns
Shift        80/240 ns†    160 ns

† (Single length)/(double length)

[Figure 6 labels: NEWS drivers and receivers; control unit; R register (RGR); multiplicand select gates (MSG); pseudoadder tree (PAT); carry propagate adder (CPA); S register (RGS); A register (RGA); leading ones detector (LOD); drivers and receivers; mode register (RGM); operand select gates (OSG); B register (RGB); logic unit (LOG); barrel switch (BSW); address adder (ADA); X register (RGX); memory address registers (MAR).]

Fig. 6. Processing-element block diagram.

[Figure 7 labels: 64-bit word with sign S, exponent E (15 bits), and mantissa F (48 bits); 32-bit mode with two operands, each having a sign, a 7-bit exponent, and a 24-bit mantissa; 8-bit mode with eight bytes occupying bits 0-7 through 56-63. S: sign; E: exponent; F: mantissa.]

Fig. 7. ILLIAC IV data representation.

Fig. 8. (a) Electrical connectivity for routing, (b) Physical layout.

[Figure 9 labels: B 6500; interrupt control; descriptors; data words; CUs; IOC; memory request; PEMs; EU and BUFF units; IOS.]

Fig. 9. I/O data path.

[Figure 10 panels: GPC-IOC-disk test; PEM test; CU test; PE test. Units shown: CU, PE, PEM, GPC, IOC, disk. Legend: to be tested; partially tested; tested.]

Disk  and  on-line  I/O  data  are  transmitted  on  a  1024-line  bus 
which  can  be  switched  among  the  arrays.  Within  each  array, 
parallel  connection  is  made  to  a  selected  16  of  64  PEs,  2  per  PUC. 
Maximum data rate is one I/O transaction per microsecond or 10^9
bits  per  second.  The  I/O  path  of  1024  lines  is  expandable  to  4096 
lines  if  required. 
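
The bus width and the quoted data rate follow from figures already given, as this short Python calculation shows. The variable names are illustrative; the inputs (64-bit words, 16 connected PEs, one transaction per microsecond, expansion to 4096 lines) are taken from the text.

```python
BITS_PER_WORD = 64
PES_CONNECTED = 16                     # 2 per cabinet, 8 cabinets per array
TRANSACTIONS_PER_SECOND = 1_000_000    # one I/O transaction per microsecond

bus_lines = PES_CONNECTED * BITS_PER_WORD               # lines on the I/O bus
bits_per_second = bus_lines * TRANSACTIONS_PER_SECOND   # about 10^9
expanded_rate = 4096 * TRANSACTIONS_PER_SECOND          # if widened to 4096 lines

print(bus_lines, bits_per_second, expanded_rate)
```

Sixteen 64-bit words account exactly for the 1024 lines, and the resulting rate of 1.024 × 10^9 bits per second is the figure the text rounds to a power of ten.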

Processing  element  memory  (PEM) 

The  individual  memory  attached  to  each  processing  element  is 
a  thin-film  DRO  linear  select  memory  with  a  cycle  time  of  240 
ns  and  access  time  of  120  ns.  Each  has  a  capacity  of  2048  64-bit 
words.  The  memory  is  independently  accessible  by  its  attached 
PE,  the  CU,  or  I/O  connections. 

Disk-file  subsystem 

The  computing  speed  and  memory  of  the  ILLIAC  IV  arrays  re- 
quire a  substantial  secondary  storage  for  program  and  data  files 
as  well  as  backup  memory  for  programs  whose  data  sets  exceed 
fast memory capacity. The disk-file subsystem consists of six Bur-
roughs model IIA storage units, each with a capacity of 1.61 × 10^8
bits and a maximum latency of 40 ms. The system is dual; each
half has a capacity of 5 × 10^8 bits and independent electronics
capable  of  supporting  a  transfer  rate  of  500  megabits  per  second. 
The  data  path  from  each  of  the  disk  subsystems  becomes  1024 
bits  wide  at  its  interface  with  the  array.  Figure  9  shows  the 
organization  of  the  disk-file  system. 

B  6500  control  computer 

The  B  6500  computer  is  assigned  the  following  functions. 

1  Executive  control  of  the  execution  of  array  programs 

2  Control  of  the  multiple-array  configuration  operations 

3  Supervision  of  the  internal  I/O  processes  (disk  to  arrays, 
etc.) 

4  External  I/O  processing  and  supervision 

5  Processing  and  supervision  of  the  files  on  the  disk  file  sub- 
system 

6  Independent  data  processing,  including  compilation  of 
ILLIAC  IV  programs 

To control the array operations, there is a single interrupt line
and a 16-bit data path both ways between the B 6500 and each
of the control units. In addition, the B 6500 has a control and data
path to the I/O controller (IOC) which supervises the disk, and
also direct connections to the array memories.

Fig. 10. System diagnostic sequence.

Reliability  and  maintenance  of  the  ILLIAC  IV 

The progress in computer components from vacuum tubes to semi-
conductors over several generations has improved the mean-time-
between-failures for computers from tens of hours to several thou-
sand hours. By using larger scale integration, a tenfold increase
in number of gates per system should be possible with comparable
reliability.

It is only by virtue of high-density integration (50- to 100-gate
package)  that  the  design  of  a  three-million-gate  system  can  be 
contemplated.  Reliability  of  the  major  part  of  the  system,  256 
processing  elements  and  256  memory  units,  is  expected  to  be  in 
the range of 10'^ hours per element and 2 × 10^ hours per memory
unit.

The organization of the ILLIAC IV as a collection of identical
units  simplifies  its  maintenance  problems.  The  processing  ele- 
ments, the  memories,  and  some  part  of  power  supplies  are  designed 
to  be  pluggable  and  replaceable  to  reduce  system  down  time  and 
improve  system  availability. 


The  remaining  problems  are  (1)  location  of  the  faulty  subsys- 
tem, and  (2)  location  of  the  faulty  package  in  the  subsystem. 

Location of the faulty subsystem assumes the B 6500 to be
fault-free, since this can be determined by using the standard
B 6500 maintenance routines. The steps to follow are shown in
Fig.  10. 

The B 6500 tests the control units (CU) which in turn test all
PEs. PEMs are tested through the disk channel. This capability
for functional partitioning of the subsystems simplifies the
diagnostic procedure considerably.

References 

HollJ59; KuckD68; MurtJ66; SlotD62; UngeS58


330  Part  4  I  The  instruction-set  processor  level:  special-function  processors 


Section  2  |  Processors  for  array  data 


APPENDIX  1 

A1. CLASSIFIED LIST OF CU INSTRUCTIONS

A1.1 Data transmission

ALIT    Add literal (24 bit) to CAR.
BIN     Block fetch to CU memory.
BINX    Indexed (by PE index) block fetch.
BOUT    Block store from CU memory.
BOUTX   Indexed block store.
CLC     Clear CAR.
COPY    Copy CAR into CAR of other quadrant.
DUPI    Duplicate inner half of CU memory address contents into
        both halves of CAR.
DUPO    Duplicate outer half of CU memory address contents into
        both halves of CAR.
EXCHL   Exchange contents of CAR with CU memory address contents.
LDL     Load CAR from CU memory address contents.
LIT     Load CAR with 64-bit literal following the instruction.
LOAD    Load CU memory from contents of PE memory address found
        in CAR.
LOADX   Load CU memory from contents of PE memory address found
        in CAR, indexed by PE index.
ORAC    OR all CARs in array and place in CAR.
SLIT    Load CAR with 24-bit literal.
STL     Store CAR into CU memory.
STORE   Store CAR into PE memory.
STOREX  Store CAR into PE memory, indexed by PE index.
TCCW    Transmit CAR counterclockwise between CUs in array.
TCW     Transmit CAR clockwise between CUs in array.

A1.2 Skip and test

CTSB{T, F}, A
        Skip on nth bit of CAR. If T is present, skip if 1; if F is
        present, skip if 0. If A is present, AND together bits from
        all CUs in array before testing; if absent, OR together
        bits from all CUs in array before testing.
        4 instructions: CTSBT, CTSBTA, CTSBF, CTSBFA.

EQL{T, F}, A
        Skip on CAR equal to CU memory address contents. The
        letters T, F, and A have the same meaning as in CTSB above.
        4 instructions: EQLT, EQLTA, EQLF, EQLFA.

EQLX{T, F}, A
        Skip on index portion of CAR (bits 40 through 63) equal to
        bits 40 through 63 of CU memory address contents. The
        letters T, F, and A have the same meaning as in CTSB above.
        4 instructions: EQLXT, EQLXTA, EQLXF, EQLXFA.

GRTR{T, F}, A
        Skip on index part of CAR (bits 40 through 63) greater than
        bits 40 through 63 of CU memory address contents. The
        letters T, F, and A have the same meaning as in CTSB above.
        4 instructions: GRTRT, GRTRTA, GRTRF, GRTRFA.

LESS{T, F}, A
        Skip on index part of CAR (bits 40 through 63) less than
        bits 40 through 63 of CU memory address contents. The
        letters T, F, and A have the same meaning as in CTSB above.
        4 instructions: LESST, LESSTA, LESSF, LESSFA.

ONES{T, F}, A
        Skip on CAR equal to all 1's. The letters T, F, and A have
        the same meaning as in CTSB above.
        4 instructions: ONEST, ONESTA, ONESF, ONESFA.

ONEX{T, F}, A
        Skip on bits 40 through 63 of CAR equal to all 1's. The
        letters T, F, and A have the same meaning as in CTSB above.
        4 instructions: ONEXT, ONEXTA, ONEXF, ONEXFA.

SKIP{T, F}, A
        Skip on T-F flip-flop previously set. The letters T, F, and
        A have the same meaning as in CTSB above.
        4 instructions: SKIPT, SKIPTA, SKIPF, SKIPFA.

SKIP    Skip unconditionally.

TXL{T, F}, A, I
        Skip on index portion of CAR (bits 40 through 63) less than
        limit portion (bits 1 through 15). The letters T, F, and A
        have the same meaning as in CTSB above. If I is present,
        the index portion of CAR is incremented by the increment
        portion of CAR (bits 16 through 39) while the test is in
        progress; if I is not present, no incrementing takes place.
        8 instructions: TXLT, TXLTI, TXLTA, TXLTAI, TXLF, TXLFI,
        TXLFA, TXLFAI.

TXE{T, F}, A, I
        Skip on index portion of CAR (bits 40 through 63) equal to
        limit portion of CAR (bits 1 through 15). See CTSB for the
        meaning of T, F, and A; see TXL above for the meaning of I.
        8 instructions: TXET, TXETI, TXETA, TXETAI, TXEF, TXEFI,
        TXEFA, TXEFAI.

TXG{T, F}, A, I
        Skip on index portion of CAR (bits 40 through 63) greater
        than limit portion of CAR (bits 1 through 15). See CTSB for
        the meaning of T, F, and A; see TXL above for the meaning
        of I.
        8 instructions: TXGT, TXGTI, TXGTA, TXGTAI, TXGF, TXGFI,
        TXGFA, TXGFAI.

ZER{T, F}, A
        Skip on CAR all 0's. See CTSB for the meaning of T, F,
        and A.
        4 instructions: ZERT, ZERTA, ZERF, ZERFA.

ZERX{T, F}, A
        Skip on index portion of CAR (bits 40 through 63) all 0's.
        See CTSB for the meaning of T, F, and A.
        4 instructions: ZERXT, ZERXTA, ZERXF, ZERXFA.


A1.3 Transfer of control

EXEC    Execute instruction found in bits 32 through 63 of CAR.
EXCHL   Exchange contents of CAR with contents of CU memory
        address.
HALT    Halt ILLIAC IV.
JUMP    Jump to address found in instruction.
LOAD    Load CU memory address contents from contents of PE
        memory address found in CAR.
LOADX   Load CU memory address contents from contents of PE
        memory address found in CAR, indexed by PE index.
STL     Store CAR into CU memory.

A1.4 Route

RTE     Route. Routing distance is found in address field (CAR
        indexable), and register connectivity is found in the
        skip field.

A1.5 Arithmetic

ALIT    Add 24-bit literal to CAR.
CADD    Add contents of CU memory address to CAR.
CSUB    Subtract contents of CU memory address from CAR.
INCRXC  Increment index word in CAR.

A1.6 Logical

CAND    AND CU memory to CAR.
CCB     Complement bit of CAR.
CEXOR   Exclusive OR CU memory to CAR.
CLC     Clear CAR.
COR     OR CU memory to CAR.
CRB     Reset bit of CAR.
CROTL   Rotate CAR left.
CROTR   Rotate CAR right.
CSB     Set bit of CAR.
CSHL    Shift CAR left.
CSHR    Shift CAR right.
LEADO   Detect leading ONE in CAR of all quadrants in array.
LEADZ   Detect leading ZERO in CAR of all quadrants in array.
ORAC    OR all CARs in array and place in CAR.
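
The T, F, and A modifiers of the skip instructions in A1.2 can be illustrated with a short Python sketch. It is not taken from any ILLIAC IV software: the function names are hypothetical, and it assumes that bit 0 is the most significant bit of the 64-bit CAR, which the appendix does not state. The I (increment) variant of TXL is left out because the text does not say whether the comparison sees the old or the new index.

```python
WORD_BITS = 64


def car_bit(car: int, n: int) -> int:
    """Return bit n of a 64-bit CAR value (assumption: bit 0 is the MSB)."""
    return (car >> (WORD_BITS - 1 - n)) & 1


def car_field(car: int, first: int, last: int) -> int:
    """Extract bits first..last inclusive, under the same MSB-first numbering."""
    width = last - first + 1
    return (car >> (WORD_BITS - 1 - last)) & ((1 << width) - 1)


def ctsb_skips(cars, n, sense, use_and):
    """Model CTSB with its T/F and A modifiers.

    cars    -- one CAR value per CU in the array
    sense   -- 'T' skips when the combined bit is 1, 'F' when it is 0
    use_and -- True for the A variants (AND across CUs), False for OR
    """
    bits = [car_bit(c, n) for c in cars]
    combined = all(bits) if use_and else any(bits)
    return combined if sense == "T" else not combined


def txl_condition(car: int) -> bool:
    """Local TXL comparison: index (bits 40-63) less than limit (bits 1-15)."""
    return car_field(car, 40, 63) < car_field(car, 1, 15)
```

With two CUs whose CARs differ in bit 0, the OR form (CTSBT) skips while the AND form (CTSBTA) does not, which is the distinction the A modifier draws.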

A2. CLASSIFIED LIST OF PE INSTRUCTIONS

A2.1 Data transmission

LDA     Load A register.
LDB     Load B register.
LDR     Load R register.
LDS     Load S register.
LDX     Load X register.
LDC0    Load CAR 0 from PE register.
LDC1    Load CAR 1 from PE register.
LDC2    Load CAR 2 from PE register.
LDC3    Load CAR 3 from PE register.
LEX     Load exponent of A register.
ONES    Load all ONES into A register.
STA     Store A register.
STB     Store B register.
STC     Store C register.
STR     Store R register.
STS     Store S register.
STX     Store X register.
SWAPA   Interchange inner and outer ... register.
SWAP    Interchange the contents of A register and B register.
SWAPX   Interchange outer operand of A register and inner operand
        of B.

A2.2 Index operations

IX{L, E, G}, I
        Set I on comparison of X register and operand. The presence
        of L means set I if X is less than operand; the presence of
        E means set I if X is equal to operand; the presence of G
        means set I if X is greater than operand. If I is present,
        increment X while performing test; if I is absent, do not
        increment X.
        6 instructions: IXL, IXLI, IXE, IXEI, IXG, IXGI.

JX{L, E, G}, I
        Set J on comparison of X register and operand. See above
        for meaning of L, E, G, and I.
        6 instructions: JXL, JXLI, JXE, JXEI, JXG, JXGI.

XI      Increment PE index (X register) by bits 48 through 63 of
        operand.
XIO     Increment PE index by bits 48 through 63 of operand plus
        one.

A2.3 Mode setting/comparisons

EQB     Test A and B for equality bytewise.
GRB     Test B register greater than A register bytewise.
LSB     Test B register less than A register bytewise.
CHWS    Change word size.

I{L, A, M}L
        Set I if A register is less than operand. L means test
        logical; A means test arithmetic; M means test mantissa.
        3 instructions: ILL, IAL, IML.

I{L, A, M}E
        Set I if A register is equal to operand. See above for
        meaning of L, A, and M.
        3 instructions: ILE, IAE, IME.

I{L, A, M}G
        Set I if A register is greater than operand. See above for
        meaning of L, A, and M.
        3 instructions: ILG, IAG, IMG.

I{L, A, M}Z
        Set I if A register is equal to all zeros.
        3 instructions: ILZ, IAZ, IMZ.

I{L, A, M}O
        Set I if A register is equal to all ONES.
        3 instructions: ILO, IAO, IMO.

J{L, A, M}{L, E, G, Z, O}
        Set J under conditions specified in set of instructions
        immediately above.
        15 instructions: JLL, JAL, JML, JLE, JAE, JME, JLG, JAG,
        JMG, JLZ, JAZ, JMZ, JLO, JAO, JMO.

IX{L, E, G}, I
        Set I on comparison of X register and operand. See Section
        A2.2 for meaning of L, E, G, and I.
        6 instructions: IXL, IXLI, IXE, IXEI, IXG, IXGI.

JX{L, E, G}, I
        Set J on comparison of X register and operand. See Section
        A2.2 for meaning of L, E, G, and I.
        6 instructions: JXL, JXLI, JXE, JXEI, JXG, JXGI.

IS{L, E, G}
        Set I on comparison of S register and operand. See Section
        A2.2 for meaning of L, E, and G.
        3 instructions: ISL, ISE, ISG.

JS{L, E, G}
        Set J on comparison of S register and operand. See Section
        A2.2 for meaning of L, E, and G.
        3 instructions: JSL, JSE, JSG.

ISN     Set I from the sign bit of A register.
JSN     Set J from the sign bit of A register.
SETE    Set E bit as a logical function of other bits.
SETE1   Set E1 bit similarly.
SETF    Set F bit similarly.
SETF1   Set F1 bit similarly.
SETG    Set G bit similarly.
SETH    Set H bit similarly.
SETI    Set I bit similarly.
SETJ    Set J bit similarly.
SETC0   Set Pth bit of CAR 0 similarly.
SETC1   Set Pth bit of CAR 1 similarly.
SETC2   Set Pth bit of CAR 2 similarly.
SETC3   Set Pth bit of CAR 3 similarly.
IBA     Set I from Nth bit of A register; bit number is found in
        address field.
JBA     Set J from Nth bit of A register; bit number is found in
        address field.

A2.4 Arithmetic

ADB     Add bytewise.
SBB     Subtract operand from A register bytewise.
ADD     Add A register and operand as 64-bit operands.
SUB     Subtract operand from A register as 64-bit quantities.

AD{R, N, M, S}
        Add operand to A register. The R, N, M, S specify all
        possible variants of the arithmetic instruction. The
        meaning of each letter, if present in the mnemonic, is
          R  round result
          N  normalize result
          M  mantissa only
          S  special treatment of signs.
        16 instructions: ADM, ADMS, ADNM, ADNMS, ADN, ADNS, ADRM,
        ADRMS, ADRNM, ADRNMS, ADRN, ADRNS, ADR, ADRS, AD, ADS.

ADEX    Add to exponent.

DV{R, N, M, S}
        Divide by operand. See AD instruction for meaning of R, N,
        M, and S.
        16 instructions: DVM, DVMS, DVNM, DVNMS, DVN, DVNS, DVRM,
        DVRMS, DVRNM, DVRNMS, DVRN, DVRNS, DVR, DVRS, DV, DVS.

EAD     Extend precision after floating point ADD.
ESB     Extend precision after floating point SUBTRACT.
LEX     Load exponent of A register.

ML{R, N, M, S}
        Multiply by operand. See AD instruction for meaning of R,
        N, M, and S.
        16 instructions: MLM, MLMS, MLNM, MLNMS, MLN, MLNS, MLRM,
        MLRMS, MLRNM, MLRNMS, MLRN, MLRNS, MLR, MLRS, ML, MLS.

SAN     Set A register negative.
SAP     Set A register positive.
SBEX    Subtract exponent of operand from exponent of A register.

SB{R, N, M, S}
        Subtract operand from A register. See AD instruction for
        meaning of R, N, M, and S.
        16 instructions: SBM, SBMS, SBNM, SBNMS, SBN, SBNS, SBRM,
        SBRMS, SBRNM, SBRNMS, SBRN, SBRNS, SBR, SBRS, SB, SBS.

NORM    Normalize A register.
MULT    In 32-bit mode, perform MULTIPLY and leave outer result in
        A register and inner result in B register, with both
        results extended to 64-bit format.
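
The sixteen-way families AD, DV, ML, and SB arise from appending any subset of the letters R, N, M, S to a two-letter stem. The following Python sketch enumerates and decodes such mnemonics. It is only an illustration: the function names are hypothetical, and the order in which it lists the variants differs from the order printed above.

```python
from itertools import combinations

# Modifier letters in the order they appear inside a mnemonic:
# round, normalize, mantissa only, special treatment of signs.
MODIFIERS = "RNMS"


def variants(stem: str) -> list[str]:
    """Return the 16 mnemonics formed by appending any subset of R, N, M, S."""
    names = []
    for size in range(len(MODIFIERS) + 1):
        # combinations() keeps the letters in R, N, M, S order.
        for combo in combinations(MODIFIERS, size):
            names.append(stem + "".join(combo))
    return names


def decode(mnemonic: str, stem: str) -> dict[str, bool]:
    """Report which modifiers are present in a mnemonic built on stem."""
    suffix = mnemonic[len(stem):]
    return {
        "round": "R" in suffix,
        "normalize": "N" in suffix,
        "mantissa_only": "M" in suffix,
        "special_signs": "S" in suffix,
    }
```

For example, `variants("ML")` contains both the bare `ML` and the fully modified `MLRNMS`, and `decode("SBRNM", "SB")` reports rounding, normalization, and mantissa-only operation without special sign treatment.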


A2.5 Logical

{N, Z, O}AND{N, Z, O}
        AND A register with operand. The left-hand set of letters
        specifies a variant on the A register, the right-hand set,
        on the operand. The meaning of these variants is
          not present  use true
          N            use complement
          Z            use all ZEROS
          O            use all ONES.
        16 instructions: AND, ANDN, ANDZ, ANDO, NAND, NANDN, NANDZ,
        NANDO, ZAND, ZANDN, ZANDZ, ZANDO, OAND, OANDN, OANDZ,
        OANDO.

CBA     Complement bit of A register.
CHSA    Change sign of A register.

{N, Z, O}EOR{N, Z, O}
        Exclusive OR A register with operand.
        16 instructions: EOR, EORN, EORZ, EORO, NEOR, NEORN, NEORZ,
        NEORO, ZEOR, ZEORN, ZEORZ, ZEORO, OEOR, OEORN, OEORZ,
        OEORO.

LEX     Load exponent of A register.

{N, Z, O}OR{N, Z, O}
        OR A register with operand.
        16 instructions: OR, ORN, ORZ, ORO, NOR, NORN, NORZ, NORO,
        ZOR, ZORN, ZORZ, ZORO, OOR, OORN, OORZ, OORO.

RBA     Reset bit of A register to ZERO.
RTAL    Rotate A register left.
RTAML   Rotate mantissa of A register left.
RTAMR   Rotate mantissa of A register right.
RTAR    Rotate A register right.
SAN     Set A register negative.
SAP     Set A register positive.
SBA     Set bit of A register to ONE.
SHABL   Shift A and B registers double-length left.
SHABR   Shift A and B registers double-length right.
SHAL    Shift A register left.
SHAML   Shift A register mantissa left.
SHAR    Shift A register right.
SHAMR   Shift A register mantissa right.
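
The variant letters of the logical instructions can be made concrete with a small Python sketch. It assumes 64-bit operands and simply returns the result; where the hardware places the result is not stated in the list above, and the function names are hypothetical. Note that under this scheme NAND means "complement of A, ANDed with the operand", not the usual NOT-AND.

```python
MASK64 = (1 << 64) - 1


def apply_variant(value: int, letter: str) -> int:
    """Transform a 64-bit value according to a variant letter."""
    if letter == "":      # not present: use true value
        return value & MASK64
    if letter == "N":     # use complement
        return ~value & MASK64
    if letter == "Z":     # use all ZEROS
        return 0
    if letter == "O":     # use all ONES
        return MASK64
    raise ValueError(f"unknown variant letter: {letter!r}")


OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "EOR": lambda a, b: a ^ b,
}


def logical(mnemonic: str, a: int, operand: int) -> int:
    """Evaluate a mnemonic such as NANDO or ZEORN.

    The left letter modifies the A register, the right letter the operand.
    """
    for stem, op in OPS.items():
        for left in ("", "N", "Z", "O"):
            for right in ("", "N", "Z", "O"):
                if mnemonic == left + stem + right:
                    return op(apply_variant(a, left),
                              apply_variant(operand, right))
    raise ValueError(f"not a logical mnemonic: {mnemonic!r}")
```

Several of the 48 mnemonics are degenerate under this reading: `ORZ` leaves A unchanged, and `ZEORO` always yields all ONES.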


Section  3 

Processors  defined  by  a  microprogram 


Processors defined by a microprogram have only recently come
into existence, although Wilkes suggested the idea in 1951. The
discussion in Chap. 3 (page 71) suggests reasons why this
controversial idea has taken so long to be adopted.

Microprogramming  and  the  design  of  the  control  circuits 
in  an  electronic  computer 

Chapter  28  is  an  extension  of  an  earlier  paper  by  Wilkes.  It 
includes  an  example  of  a  microprogrammed  processor  (page 
337).  In  the  earlier  paper,  The  Best  Way  to  Design  an  Automatic 
Computing  Machine  [Wilkes,  1951a],  the  essential  ideas  of 
microprogramming  were  first  outlined. 

The  observation  that  an  instruction  set,  or  ISP,  should  be 
looked  at  as  a  program  to  be  interpreted  is  the  basis  of  micro- 
programming. The  idea  of  an  ISP  is  our  acknowledgment  that 
we,  too,  view  a  processor  as  a  program. 

There  is  little  to  say  about  this  chapter;  it  is  historical,  yet 
timely  and  well  written.  Microprogramming,  like  other  of  Wilkes' 
ideas,  is  present  in  many  of  our  computers. 

The  design  of  a  general-purpose  microprogram- 
controlled  computer  with  elementary  structure 

The SD-2 computer (Chap. 29) is described by Kampe in a
casual but highly communicative fashion. Most engineers tend
to be somewhat formal and stuffy when describing the machines
they have designed. This formal ruse can be used to
make the design seem difficult but well founded, certainly not
arbitrary. Kampe truthfully admits to making decisions in a
somewhat arbitrary fashion.

The  SD-2  microprogram  structure,  unlike  that  of  the  IBM  Sys- 
tem 360  models,  has  a  P.microprogram  which  is  similar  to  the 
external  Pc  which  it  defines.  As  such,  the  main  question  about 
this  design  is  whether  it  is  cheaper  to  have  a  single,  hard- 
wired Pc  rather  than  a  computer  within  a  computer.  The 
Packard  Bell  440  [Boutwell  and  Hoskinson,  1963]  is  an  example 
of  a  better-known  Pc  whose  internal  P  resembles  the  SD-2. 

The  authors  of  this  book  feel  that,  when  the  internal  and 
external  P's  are  so  similar,  it  may  be  better  to  have  a  single 
P  which  suits  both  needs.  To  gain  speed  and  still  define  powerful 
functions,  Mp  could  be  made  up  of  both  the  conventional  Mp 
and  a  small,  fast  Mp. 

The  Hewlett-Packard  HP  9100A  computing  calculator 

The  HP  9100A  (Chap.  20)  is  discussed  in  Part  3,  Sec.  4,  page 
235. 

Microprogrammed  implementation  of  EULER 
on  the  IBM  System  360/Model  30 

This  microprogrammed  processor  in  Chap.  32  is  also  discussed 
as  a  language  processor  in  Part  4,  Sec.  4,  page  348. 


Chapter  28 


Microprogramming and the design
of the control circuits in an electronic
digital computer¹

M. V. Wilkes / J. B. Stringer

1. Introduction

Experience  has  shown  that  the  sections  of  an  electronic  digital 
computer  which  are  easiest  to  maintain  are  those  which  have  a 
simple  logical  structure.  Not  only  can  this  structure  be  readily 
borne  in  mind  by  a  maintenance  engineer  when  looking  for  a  fault, 
but  it  makes  it  possible  to  use  fault-locating  programmes  and  to 
test  the  equipment  without  the  use  of  elaborate  test  gear.  It  is 
in  the  control  section  of  electronic  computers  that  the  greatest 
degree  of  complexity  generally  arises.  This  is  particularly  so  if  the 
machine  has  a  comprehensive  order  code  designed  to  make  it 
simple  and  fast  in  operation.  In  general,  for  each  different  order 
in  the  code  some  special  equipment  must  be  provided,  and  the 
more  complicated  the  function  of  the  order  the  more  complex  this 
equipment.  In  the  past,  fear  of  complicating  unduly  the  control 
circuits  of  the  machines  has  prevented  the  designers  of  electronic 
machines  from  providing  such  facilities  as  orders  for  floating-point 
operations,  although  experience  with  relay  machines  and  with 
interpretive  subroutines  has  shown  how  valuable  such  orders  are. 
This paper describes a method of designing the control circuits
of a machine which is wholly logical and which enables alterations
or additions to the order code to be made without ad hoc
alterations to the circuits. An outline of this method was given by
one of us [Wilkes, 1951a] at the Conference on Automatic
Calculating Machines at the University of Manchester in July 1951.

The operation called for by a single machine order can be
broken down into a sequence of more elementary operations; for
example, shifting a number in the accumulator one place to the
right may involve, first, a transfer of the number to an auxiliary
shifting register, and secondly, the transfer of the number back
to the accumulator along an oblique path. These elementary
operations will be referred to as micro-operations. Basic machine
operations, such as addition, subtraction, multiplication, etc., are
thought of as being made up of a micro-programme of
micro-operations, each micro-operation being called for by a
micro-order. The process of writing a micro-programme for a machine
order is very similar to that of writing a programme for the whole
calculation in terms of machine orders.

¹Proc. Cambridge Phil. Soc., pt. 2, vol. 49, pp. 230-238, April, 1953.

For the method to be applicable it is necessary that the
machine should contain a suitable permanent rapid-access storage
device in which the micro-programme can be held (a diode matrix
is proposed in the case of the machine discussed as an example
below) and that means should be provided for executing the
micro-orders one after the other. It is also necessary that provision
should be made for conditional micro-orders which play a role
in micro-programming similar to that played by conditional orders
in ordinary programming.

Since  the  only  feature  of  the  machine  which  has  to  be  designed 
specially  for  any  particular  set  of  machine  orders  is  the  configura- 
tion of  diodes  in  the  matrix,  or  the  corresponding  configuration 
in  whatever  equivalent  device  is  used,  there  is  no  difficulty  in 
making  changes  to  the  order  code  of  the  machine  if  experience 
shows  them  to  be  desirable;  in  fact,  the  design  of  the  machine 
in  the  first  place  can  be  carried  out  completely  without  a  firm 
decision  on  the  details  of  the  order  code  being  taken,  as  long  as 
care  is  taken  to  provide  accommodation  for  the  greatest  number 
of  micro-orders  that  are  likely  to  be  required.  It  would  even  be 
possible  to  have  a  number  of  interchangeable  matrices  providing 
for  different  order  codes,  so  that  the  user  could  choose  the  one 
most  suited  to  his  particular  requirements. 

2.    Description  of  the  proposed  system 

The system will be described in relation to a parallel machine
having an arithmetical unit designed along conventional lines. This
will contain a set of registers and an adder together with a
switching system which enables the micro-operations in the various
machine orders to be performed. Some of the micro-operations
will be simple transfers of a number from one register to another
with or without shifting of the number one place to the left or
the right, while others will also involve the use of the adder. Any
particular micro-operation can be performed by applying pulses
simultaneously to the appropriate gates of the switching system.
In certain cases it may be possible for two or more
micro-operations to take place at the same time.

It  will  be  convenient  to  regard  the  control  system  as  consisting 
of  two  parts.  A  register  is  needed  to  hold  the  address  of  the  next 
order  due  to  be  executed,  and  another  to  hold  the  current  order 
while  it  is  being  executed,  or  at  any  rate  during  part  of  that  time. 
Some  means  of  counting  the  number  of  steps  in  a  shifting  operation 
or  a  multiplication  must  also  be  provided.  One  method  of  meeting 
these  requirements  is  to  provide  a  group  of  registers  and  an  adder 
together  with  a  switching  system  which  enables  transfers  of  num- 
bers, with  or  without  addition,  to  be  made.  This  part  of  the  control 
system  will  be  called  the  control  register  unit.  In  any  case  the 
operations  which  need  to  be  performed  on  the  numbers  standing 
in  the  control  register  unit  during  the  execution  of  an  order  are, 
like  the  operations  performed  in  the  arithmetical  unit,  regarded 
as  being  made  up  of  a  sequence  of  micro-operations,  each  of  which 
is  performed  by  the  application  of  pulses  to  appropriate  gates. 

The other part of the control system is concerned with control
of the sequence of micro-orders required to carry out each machine
order, and with the operation of the gates required for the
execution of each micro-order. This will be called the micro-control
unit; it consists of a decoding tree, two rectifier matrices and two
registers (additional to those of the control register unit) connected
as indicated in Fig. 1, which shows how the pulses used to operate
the gates in the arithmetical unit and control register unit are
generated. A series of control pulses from a pulse generator are
applied to the input of the decoding tree. Each pulse is routed
to one of the output lines of the tree, according to the number
standing in register I. The output lines all pass into a rectifier
matrix A and the outputs of this matrix are the pulses which
operate the various gates associated with micro-operations. Thus
one input line of the matrix corresponds to one micro-order. The
address of the micro-order is the number which must be placed
in register I to cause the control pulse to be routed to the
corresponding line. The output lines from the tree also pass into a
second matrix B, which has its outputs connected to register II.
This matrix has wired on it the address of the micro-order to be
performed next in time so that the address of this micro-order is
placed in register II. Just before the next control pulse is applied
to the input of the tree a connexion is established between register
II and register I, and the address of the micro-order due to be
executed next is transferred into register I. In this way the
decoding tree is prepared to route the next incoming control pulse
to the correct output line. Thus application of pulses alternately
to the input of the tree and to the gate connecting registers I and
II causes a predetermined sequence of micro-orders to be executed.

Fig. 1. Micro-control unit.

It is necessary to have means whereby the course of the
micro-programme can be made conditional on whether a given digit in
one of the registers of the arithmetical unit or control register unit
is a 1 or a 0. The means of doing this is shown at X in Fig. 1.
A two-way switch, controlled by a special flip-flop called a
conditional flip-flop, is inserted between matrix A and matrix B. The
conditional flip-flop can be set by an earlier micro-order with any
digit from any one of the registers. Two separate addresses are
wired into matrix B, and the one which passes into register I, and
thus becomes the address of the next micro-order, is determined
by the setting of the conditional flip-flop.

Conditional  micro-orders  play  the  same  part  in  the  construction 
of  micro-programmes  as  conditional  orders  play  in  the  construction 
of  ordinary  programmes;  apart  from  their  obvious  uses  in  micro- 
programmes  for  such  operations  as  multiplication  and  division, 
they  enable  repetitive  loops  of  micro-orders  to  be  used. 
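
The sequencing just described can be sketched as a small Python simulation. This is a model under stated assumptions, not a description of the actual circuits: each micro-order is represented by a function standing in for the gates pulsed through matrix A, plus one or two next addresses standing in for matrix B. The class and function names, the use of a halt address, and the register names C, D, and G in the example are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class MicroOrder:
    """One micro-order: a row of matrix A plus its row(s) of matrix B."""
    operation: Callable[[dict], None]  # matrix A: gates pulsed, as a function
    next_if_0: int                     # matrix B: next address, flip-flop 0
    next_if_1: Optional[int] = None    # matrix B: alternative, flip-flop 1


def run(matrices: Dict[int, MicroOrder], state: dict, start: int,
        halt: int, max_steps: int = 1000) -> dict:
    """Execute micro-orders until control reaches the halt address."""
    register_i = start
    for _ in range(max_steps):
        if register_i == halt:
            return state
        order = matrices[register_i]   # decoding tree routes the pulse
        order.operation(state)         # matrix A operates the gates
        if order.next_if_1 is not None and state["flip_flop"]:
            register_ii = order.next_if_1   # two-way switch: second address
        else:
            register_ii = order.next_if_0
        register_i = register_ii       # transfer II to I before next pulse
    raise RuntimeError("micro-programme did not reach the halt address")


def count_down(s): s["G"] -= 1
def set_flip_flop(s): s["flip_flop"] = 1 if s["G"] < 0 else 0  # sign digit
def no_op(s): pass
def add_d_to_c(s): s["C"] += s["D"]


# A repetitive loop: add D into C once for each count held in G.
LOOP = {
    0: MicroOrder(count_down, next_if_0=1),
    1: MicroOrder(set_flip_flop, next_if_0=2),
    2: MicroOrder(no_op, next_if_0=3, next_if_1=4),  # conditional micro-order
    3: MicroOrder(add_d_to_c, next_if_0=0),
}

result = run(LOOP, {"C": 0, "D": 3, "G": 4, "flip_flop": 0}, start=0, halt=4)
```

In the example the flip-flop is set by one micro-order (address 1) and consulted by a later one (address 2), mirroring the remark that the flip-flop is set "by an earlier micro-order". After the run, `result["C"]` holds 12, that is, D added four times.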

If desired, two branchings may be inserted in the connexions
between matrix A and matrix B, so that any one of four alternative
addresses for the next micro-order may be selected according to
the settings of two conditional flip-flops. Another possibility is to
make the output from the decoding tree branch before it enters
matrix A so that the nature of the micro-operation that is
performed depends on the setting of the conditional flip-flop.

The micro-programme wired on to the matrices contains
sections for performing the operations required by each order in the
basic order code of the machine. To initiate the operation it is
only necessary that control in the micro-programme should be sent
to the correct entry point. This is done by placing the function
digits of the order in the least significant part of register II, the
other digits in this register being made zero. The micro-programme
is constructed so that when this number passes into register I,
control in the micro-programme is sent to the correct entry point.
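
The entry-point rule amounts to using the function digits of the order, zero-extended, as a micro-order address. A minimal Python sketch follows. The paper does not give the layout of an order word, so the assumption that the function digits sit directly above an address field is purely illustrative, as are the function and parameter names.

```python
def entry_address(order: int, address_bits: int, function_bits: int) -> int:
    """Return the value loaded into register II when an order is decoded.

    Assumes (hypothetically) that the function digits sit directly above
    an address field of address_bits digits in the order word.
    """
    function_digits = (order >> address_bits) & ((1 << function_bits) - 1)
    # The function digits occupy the least significant part of register II;
    # every other digit of the register is zero.
    return function_digits


def entry_points(function_bits: int) -> range:
    """Micro-order addresses that must be reserved as entry points."""
    return range(1 << function_bits)
```

One consequence of this scheme is that the lowest micro-order addresses, one per function code, are reserved as entry points, each being the first micro-order of the section for that machine order.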

The  switching  system  in  the  arithmetical  unit  may  either  be 
designed  to  permit  a  large  variety  of  micro-operations  to  be  per- 
formed, or  it  may  be  restricted  so  as  to  allow  only  a  small  number 
of  such  operations.  In  a  machine  with  a  comprehensive  order  code 
there  is  much  to  be  said  for  having  the  more  flexible  switching 
system  since  this  will  enable  an  economy  to  be  made  in  the  number 
of  micro-orders  needed  in  the  micro-programme. 

A similar remark applies in connexion with the degree of
flexibility to be provided when designing the switching system for the
control register unit. If the specification of the machine allows
the same number of registers to be used in the arithmetical and
control sections, the construction of these two sections may be
identical except as far as the number of digits is concerned. In
a new machine under construction in the Mathematical
Laboratory, Cambridge, the registers are being constructed in basic units
each containing five registers and an adder-subtractor together
with the associated switching system. It is hoped that it will be
possible to use identical units in the arithmetical unit and in the
control register unit.

3.  Example 

An  example  will  now  be  given  to  show  the  way  in  which  a  micro- 
programme  can  be  drawn  up  for  a  machine  with  a  single-address 
order  code  covering  the  usual  operations.  It  is  supposed  that  the 
arithmetical  unit  contains  the  following  registers: 

A  multiplicand  register 

B  accumulator  (least  significant  half) 

C  accumulator  (most  significant  half) 

D  shift  register 

The  registers  in  the  control  register  unit  are  as  follows: 


E  register  connected  to  the  access  circuits  of  the  store;  the 
address  of  a  storage  location  to  which  access  is  required 
is  placed  here 

F  sequence  control  register;  contains  address  of  next  order  due 
to  be  executed 

G    register  used  for  counting 

It  was  assumed  when  drawing  up  the  micro-programme  that  there 
was  an  adder-subtractor  in  the  arithmetical  unit  with  one  input 
permanently  connected  to  register  D,  and  a  similar  adder-sub- 
tractor in  the  control  register  unit  with  one  input  permanently 
connected  to  register  G.  For  convenience  it  was  assumed  that  the 
switching  systems  in  each  case  were  comprehensive  enough  to 
provide  any  micro-operation  required.  It  was  further  supposed  that 
the  arithmetical  unit  provided  for  20  digits  and  that  the  numbers 
0,  1  and  18  could  be  introduced  at  will  into  one  of  the  registers 
or the adder of the control register unit. Two conditional flip-flops
are used. All micro-operations including those involving access to
the store are supposed to take the same amount of time. Reference
will be made to this point in §4.

Table  1  gives  the  order  code  of  the  machine,  and  Table  2  the 
micro-programme.  Each  line  of  Table  2  refers  to  one  micro-order; 
the  first  column  gives  the  address  of  the  micro-order,  the  second 
column  specifies  the  micro-operations  called  for  in  the  arithmetical 
unit  of  the  machine,  and  the  third  column  specifies  the  micro- 


Table 1

Notation

Acc     =  accumulator
Acc1    =  most significant half of accumulator
Acc2    =  least significant half of accumulator
n       =  storage location n
C(X)    =  contents of X (X = register or storage location)

Order   Effect of order

A n     C(Acc) + C(n) to Acc
S n     C(Acc) − C(n) to Acc
H n     C(n) to Acc2
V n     C(Acc2) · C(n) to Acc, where C(n) ≥ 0
T n     C(Acc1) to n, 0 to Acc
U n     C(Acc1) to n
R n     C(Acc) · 2^−(n+1) to Acc
L n     C(Acc) · 2^(n+1) to Acc
G n     If C(Acc) < 0, transfer control to n; if C(Acc) ≥ 0, ignore
        (i.e., proceed serially)
I n     Read next character on input mechanism into n
O n     Send C(n) to output mechanism


Part  4  I  The  instruction-set  processor  level:  special-function  processors 


Section  3  |  Processors  defined  by  a  microprogram 


Table 2

Notation: A, B, C, . . . stand for the various registers in the arithmetical
and control register units (see §3 of the text). 'C to D' indicates that
the switching circuits connect the output of register C to the input of
register D; '(D + A) to C' indicates that the output of register A is
connected to the one input of the adding unit (the output of D is
permanently connected to the other input), and the output of the adder
to register C. A numerical symbol n in quotes (e.g., 'n') stands for the
source whose output is the number n in units of the least significant digit.

                                                 Conditional     Next
                                                 flip-flop       micro-order
      Arithmetical        Control
      unit                register unit          Set      Use    0     1

   0                      F to G and E                           1
   1                      (G + '1') to F                         2
   2                      Store to G                             3
   3                      G to E                                 4
   4                      E to decoder                           -
A  5  C to D                                                     16
S  6  C to D                                                     17
H  7  Store to B                                                 0
V  8  Store to A                                                 27
T  9  C to Store                                                 25
U 10  C to Store                                                 0
R 11  B to D              E to G                                 19
L 12  C to D              E to G                                 22
G 13                      E to G                 (1)C_s          18
I 14  Input to Store                                             0
O 15  Store to Output                                            0
  16  (D + Store) to C                                           0
  17  (D − Store) to C                                           0
  18                                                      1      0     1
  19  D to B (R)†         (G − '1') to E                         20
  20  C to D                                     (1)E_s          21
  21  D to C (R)                                          1      11    0
  22  D to C (L)‡         (G − '1') to E                         23
  23  B to D                                     (1)E_s          24
  24  D to B (L)                                          1      12    0
  25  '0' to B                                                   26
  26  B to C                                                     0
  27  '0' to C            '18' to E                              28
  28  B to D              E to G                 (1)B_1          29
  29  D to B (R)          (G − '1') to E                         30
  30  C to D (R)                                 (2)E_s   1      31    32
  31  D to C                                              2      28    33
  32  (D + A) to C                                        2      28    33
  33  B to D                                     (1)B_1          34
  34  D to B (R)                                                 35
  35  C to D (R)                                          1      36    37
  36  D to C                                                     0
  37  (D − A) to C                                               0

† Right shift. The switching circuits in the arithmetic unit are arranged
so that the least significant digit of register C is placed in the most
significant place of register B during right shift micro-operations, and
the most significant digit of register C (sign digit) is repeated (thus
making the correction for negative numbers).

‡ Left shift. The switching circuits are similarly arranged to pass the
most significant digit of register B to the least significant place of
register C during left shift micro-operations.


Chapter  28  |  Microprogramming  and  the  design  of  the  control  circuits  in  an  electronic  digital  computer  339 


operations  called  for  in  the  control  register  unit.  The  fourth  col- 
umn shows  which  conditional  flip-flop,  if  any,  is  to  be  set  and  the 
digit which is to be used to set it; for example, (1)C_s means that
flip-flop number 1 is set by the sign digit of the number in register
C, while (2)G_1 means that flip-flop number 2 is set by the least
significant digit of the number in register G. In the case of unconditional
micro-orders columns 5 and 7 are blank and column 6
contains  the  address  of  the  next  micro-order  to  be  executed.  In 
the  case  of  conditional  micro-orders  column  5  shows  which  flip-flop 
is  used  to  operate  the  conditional  switch  and  columns  6  and  7 
give  the  alternative  addresses  to  which  control  is  to  be  sent  when 
the  conditional  flip-flop  contains  a  0  or  a  1  respectively. 

Micro-orders  0  to  4  are  concerned  with  the  extraction  of  orders 
from  the  store.  They  serve  to  bring  about  the  transfer  of  the  order 
from  the  store  to  register  E  and  then  cause  the  five  most  significant 
digits  of  the  order  to  be  placed  in  register  II  with  the  result  that 
control  is  transferred  to  one  of  the  micro-orders  5  to  15,  each  of 
which  corresponds  to  a  distinct  order  in  the  machine  order  code. 
In this way the sequence of micro-orders needed to perform the
particular operation called for is begun.

The  way  in  which  the  various  operations  are  performed  can 
be followed from Table 2. In the section dealing with multiplication,
it is assumed that numbers lie in the range −1 ≤ x < 1
and that negative numbers are represented in the machine by their
complements with respect to 2. It will be noted that the process
of drawing up a micro-programme is very similar to that of drawing
up an ordinary programme for an automatic computing machine
and the problems involved are very much alike.
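
The similarity to ordinary programming can be illustrated by a small interpreter. The following Python sketch steps through a subset of Table 2 (the extraction sequence 0 to 4 and the orders A, S, U and G); it is an illustration under stated assumptions, not the Cambridge design. The 11-digit address field, the rule that the function digits of an order equal the address of its first micro-order, and the stop code HALT are assumptions made for the simulation and do not appear in the paper.

```python
WORD = 20                          # digits in the arithmetical unit (from the text)
MASK = (1 << WORD) - 1
ADDR_BITS = 11                     # assumed width of the address part of an order
ADDR_MASK = (1 << ADDR_BITS) - 1
HALT = 31                          # hypothetical stop code, not part of Table 1

# address: (arithmetical unit, control register unit, set flip-flop 1 from
#           the sign of C, use flip-flop 1, next if 0, next if 1)
MICRO = {
    0:  (None, "F to G and E", False, False, 1, None),
    1:  (None, "(G + 1) to F", False, False, 2, None),
    2:  (None, "Store to G", False, False, 3, None),
    3:  (None, "G to E", False, False, 4, None),
    4:  (None, "E to decoder", False, False, None, None),
    5:  ("C to D", None, False, False, 16, None),           # A n
    6:  ("C to D", None, False, False, 17, None),           # S n
    10: ("C to Store", None, False, False, 0, None),        # U n
    13: (None, "E to G", True, False, 18, None),            # G n
    16: ("(D + Store) to C", None, False, False, 0, None),
    17: ("(D - Store) to C", None, False, False, 0, None),
    18: (None, None, False, True, 0, 1),
}

def order(function, n):
    """Assemble an order word: function digits above the address digits."""
    return (function << ADDR_BITS) | n

def run(store):
    """Execute micro-orders until the hypothetical HALT order is decoded."""
    r = dict.fromkeys("CDEFG", 0)
    flip_flop = 0
    upc = 0                                   # address of the current micro-order
    while True:
        arith, ctrl, ff_set, ff_use, next0, next1 = MICRO[upc]
        addr = r["E"] & ADDR_MASK             # E drives the store access circuits
        if ff_set:
            flip_flop = (r["C"] >> (WORD - 1)) & 1
        if arith == "C to D":
            r["D"] = r["C"]
        elif arith == "(D + Store) to C":
            r["C"] = (r["D"] + store[addr]) & MASK
        elif arith == "(D - Store) to C":
            r["C"] = (r["D"] - store[addr]) & MASK
        elif arith == "C to Store":
            store[addr] = r["C"]
        if ctrl == "F to G and E":
            r["G"] = r["E"] = r["F"]
        elif ctrl == "(G + 1) to F":
            r["F"] = r["G"] + 1
        elif ctrl == "Store to G":
            r["G"] = store[addr]
        elif ctrl == "G to E":
            r["E"] = r["G"]
        elif ctrl == "E to G":
            r["G"] = r["E"]
        elif ctrl == "E to decoder":
            function = r["E"] >> ADDR_BITS
            if function == HALT:
                return store
            upc = function                    # enter the micro-order for this order
            continue
        upc = next1 if (ff_use and flip_flop) else next0

# Example: form 3 - 5, and because the result is negative jump to add 100.
store = [0] * 16
store[0:8] = [order(5, 10), order(6, 11), order(13, 5), order(10, 12),
              order(HALT, 0), order(5, 13), order(10, 12), order(HALT, 0)]
store[10], store[11], store[13] = 3, 5, 100
print(run(store)[12])                         # 98
```

Note how the jump order re-enters the extraction sequence at micro-order 1 rather than 0: register E already holds n, so only the sequence control register needs to be reloaded.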

4.    The  timing  of  micro-operations 

The  assumption  that  all  micro-operations  take  the  same  length 
of  time  to  perform  is  not  likely  to  be  borne  out  in  practice.  In 
particular  in  a  parallel  machine  it  may  not  be  possible  to  design 
an adder in which the carry propagation time is sufficiently short
to enable an addition to be performed in substantially the same
length of time as that taken for a simple transfer. It will be necessary,
therefore, to arrange that the wave-form generator feeding
the decoding tree should, when suitably stimulated by a pulse from
one of the outputs from matrix A, supply a somewhat longer pulse
than that normally required. Other operations may take many times
as long to perform as an ordinary micro-order; for example, access
to  and  from  the  store  (particularly  if  a  delay  store  is  used)  and 
operation  of  the  input  and  output  devices  of  the  machine.  The 
sequence  of  operations  in  the  micro-programme  must  therefore 
be interrupted. One way of doing this is to prevent pulses from
the wave-form generator reaching the decoding tree during the
waiting  period.  This  method,  although  quite  feasible,  appears  to 
involve  just  the  kind  of  complication  which  the  present  system 
is designed to avoid. A more attractive system is to make the
machine  wait  on  a  conditional  micro-order  which  transfers  control 
back  to  itself  unless  the  associated  conditional  flip-flop  is  set. 
Setting of this flip-flop takes place when the operation is completed,
and control then goes to the next micro-order in the sequence.
The machine is thus in a condition of 'dynamic stop' while
waiting  for  the  operation  to  be  completed.  This  system  has  the 
advantage  that  no  complication  is  introduced  into  the  units  sup- 
plying the  wave-forms  to  the  decoding  tree  and  that  the  control 
equipment  required  is  similar  to  that  already  provided  for  other 
purposes. 
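
The 'dynamic stop' can be sketched in a few lines of Python. The model below is only illustrative: the three-state sequence, the name `cycles_with_dynamic_stop`, and the convention that the slow unit sets the flip-flop at the end of its last cycle are assumptions, not details taken from the paper.

```python
def cycles_with_dynamic_stop(access_time):
    """Count micro-order periods spent on one slow operation.

    access_time is the duration of the slow operation (for example a store
    access) measured in micro-order periods.
    """
    START, WAIT, DONE = 0, 1, 2
    upc = START                 # address of the current micro-order
    ready = 0                   # conditional flip-flop set by the slow unit
    remaining = access_time
    cycles = 0
    while upc != DONE:
        cycles += 1
        if upc == START:
            upc = WAIT          # initiate the operation, then go and wait
        else:
            # Conditional micro-order whose "flip-flop clear" successor is
            # itself: control returns here until the flip-flop is set.
            upc = DONE if ready else WAIT
        remaining = max(remaining - 1, 0)
        ready = 1 if remaining == 0 else 0
    return cycles

print(cycles_with_dynamic_stop(5))   # 6: one period to start, five spent waiting
```

Nothing in the loop suppresses the clock; the wait is expressed purely through the ordinary next-address mechanism, which is the advantage claimed for this scheme.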

5.  Discussion 

It will be seen that the equipment needed to execute a complicated
order in the machine order code is of the same form as that
required for a simple one, namely outlets from the decoding tree
and  diodes  in  the  matrices.  Quite  complicated  orders  can,  there- 
fore, be  built  into  the  machine  without  difficulty.  In  particular, 
arithmetical  operations  on  numbers  expressed  in  floating  binary 
form  and  other  similar  operations  can  be  micro-programmed  and 
it  is  found  that  they  do  not  involve  very  large  numbers  of  micro- 
orders.  For  example,  a  micro-programme  providing  for  the  float- 
ing-point operations  of  addition,  subtraction,  and  multiplication 
needs  about  70  micro-orders.  The  switching  system  in  the  arith- 
metical unit  must,  of  course,  be  designed  with  these  operations 
in  view.  The  decoding  tree  and  matrices  of  a  parallel  machine 
with  40  digits  in  the  arithmetical  unit  and  provision  for  256 
micro-orders  would  only  amount  to  about  15%  of  the  total  equip- 
ment in  the  machine,  so  that  it  appears  that  such  a  machine  can 
well  be  provided  with  built-in  facilities  of  considerable  complexity. 

The number of micro-orders needed in a complicated micro-programme
can sometimes be reduced by making use of what
might  be  called  micro-subroutines.  For  example,  when  two  num- 
bers have  to  be  added  together  in  a  floating  binary  machine,  some 
shifting  of  one  of  them  is  usually  necessary  before  the  addition 
can take place. By making the micro-orders for this shifting operation
serve also when a multiplication is called for, considerable
saving is effected.

Four  registers  is  the  bare  minimum  needed  in  the  arithmetical 
unit  in  order  to  enable  the  basic  arithmetical  operations  to  be 
performed.  If  any  extension  or  refinement  of  the  facilities  provided 
is required, it may be necessary to increase the number of registers.




For  example,  four  registers  are  not  sufficient  to  enable  a  succession 
of  products  to  be  accumulated  without  the  transfer  of  intermediate 
results  to  the  store,  since  the  accumulator  must  be  clear  at  the 
beginning  of  a  multiplication.  The  addition  of  one  register  enables 
the  accumulation  of  products  to  be  provided  for  in  the  micro- 
programme.  If  this  register  is  associated  with  the  outlet  from  the 
store,  it  also  enables  some  of  the  waiting  time  for  storage  access 
to  be  eliminated.  To  do  this  the  micro-programme  is  arranged  to 
call  for  a  number  from  the  store  as  soon  as  it  is  known  that  the 
number  will  be  required  and  to  continue  with  other  necessary 
micro-operations  before  finally  proceeding  to  use  the  number.  The 
'dynamic  stop'  would  occur  just  before  the  number  is  required  for 
use.  Another  way  of  saving  time  is  to  arrange,  in  the  case  of  those 
orders  which  permit  it,  for  the  next  order  to  be  extracted  from 
the  store  before  the  operation  currently  being  performed  has  been 
completed. 

The  minimum  number  of  registers  required  in  the  control 
register  unit  of  the  machine  for  the  simplest  mode  of  operation 
is  three.  If  extra  registers  are  provided  facilities  similar  to  those 
provided  by  the  B-lines  in  the  machine  at  Manchester  University 
could  be  included  in  the  micro-programme. 


6.    Microprogramming  applied  to  serial  machines 

All  the  discussion  so  far  has  been  with  reference  to  parallel  ma- 
chines because  the  technique  described  in  this  paper  is  most 
adapted  to  that  type  of  machine.  It  is,  however,  possible  to  design 
a  serial  machine  along  the  same  lines.  In  a  parallel  computer  with 
an  asynchronous  arithmetical  unit  every  gate  requires  only  one 
kind  of  wave-form  to  operate  it  and  the  timing  of  that  wave-form 
is  not  critical.  In  a  serial  machine,  on  the  other  hand,  different 
gates  require  different  wave-forms  and  the  same  gate  may  require 
different  wave-forms  at  different  times;  further,  all  these  wave- 
forms must  be  critically  timed.  These  complications  may  be 
handled  by  including  in  the  micro-control  unit  a  third  matrix,  C, 
for  selecting  the  appropriate  wave  form  for  each  micro-order.  The 
main wave-form, routed by the decoding tree and matrix A, opens
a  gate  which  is  fed  by  a  wave-form  selected  by  matrix  C.  This 
enables  a  wave-form  of  correct  duration  to  be  applied  to  any 
selected  gate  in  the  arithmetical  or  control  sections  of  the  ma- 
chine. 

References 

WilkM51a; BoutE63; FlynM67; GreeJ64, 66; MercR57; Patz67; RosiR69;
TuckS67; WilkM58b, 69; WebeH67


Chapter  29 


The  design  of  a  general-purpose 
microprogram-controlled  computer 
with elementary structure¹


Thomas W. Kampe

Summary This paper presents the design of a parallel digital computer
utilizing a 20-μsec core memory and a diode storage microprogram unit.
The machine is intended as an on-line controller and is organized for ease
of maintenance.

A word length of 19 bits provides 31 orders referring to memory locations.
Fourteen bits are used for addressing, 12 for base address, one for
index control, and one for indirect addressing. A 32nd order permits the
address bits to be decoded to generate special functions which require no
address.

The  logic  of  the  machine  is  resistor-transistor;  the  arithmetic  unit  is 
a  bus  structure  which  permits  many  variants  of  order  structure. 

In  order  to  make  logical  decisions,  a  "general-purpose"  logic  unit  has 
been  incorporated  so  that  the  microcoder  has  as  much  freedom  in  this  area 
as  in  the  arithmetic  unit. 


Introduction 

This  paper  discusses  the  logical  design  of  a  binary,  parallel,  real- 
time computer.  Only  those  aspects  of  packaging  and  circuitry 
which  bear  directly  on  this  topic  will  be  considered. 

Since  the  specifications  for  the  job  a  computer  is  to  perform 
are not enough to fix the design, the logical designer is faced with
an undetermined system. One of his main functions is to analyze
the system in its natural environment, i.e., with malfunctions,
operator  errors,  etc.,  and  to  supply  the  remainder  of  the  side 
conditions  which  do  fix  the  design. 

In  this  discussion,  the  exposition  will  be  directed  toward  the 
design  philosophy  which  led  to  a  machine  now  being  built.  In 
order  to  accomplish  this,  we  shall  consider  the  functional  require- 
ments, their  analysis  in  terms  of  the  state  of  the  art,  the  basic 
design  decisions,  and,  finally,  a  description  of  the  computer  as  it 
stands. 

¹IRE Trans., EC-9, vol. 2, pp. 208-213, June, 1960.


Functional  requirements 

The design of the computer (known, for a variety of reasons, as
the SD-2) was undertaken to supply a computer capable of moderately
fast arithmetic with perhaps five decimal places of accuracy
and 3000 or more words of storage. Furthermore, the computer
must reside in a hostile environment (a small house, 0° to
85°C temperature), withstand severe shocks, and be maintained
by men with only two weeks' training on the system. The volume
limitation is 40 cubic feet. Within this space must reside the
control computer, memory, power supplies, complete maintenance
facilities, and sufficient input-output equipment to handle 20 shaft
position outputs, 30 such inputs, numerous switch settings, and
20 or more display or relay signals.

The  final  specification  (or  blow)  was  that  15  months  were 
available  from  the  start  of  preliminary  design  to  the  delivery  of 
an  operating  instrument  with  debugged  program. 

Design  analysis 

The  maintenance  requirement  was  evidently  the  major  problem. 
In  order  to  achieve  the  simplicity  required,  two  design  criteria 
were  necessary. 

First,  the  computer  had  to  be  readily  understood.  This  implied 
that  the  usual  clever  logical  tricks  such  as  intensive  time  sharing 
of  control  and  arithmetic  were  undesirable. 

Second,  if  built-in  maintenance  facilities  were  to  be  kept  sim- 
ple, the  machine  must  be  designed  with  this  in  mind. 

Since  temperature  and  reliability  were  important,  an  extremely 
conservative  approach  had  to  be  taken  with  respect  to  component 
performance. 

With  the  schedule  requirements,  a  machine  which  could  be 
designed  and  released  in  pieces  was  needed.  Since  the  control 
system  is  usually  the  most  troublesome  part  of  a  computer  to 
design,  a  simple  control  was  needed. 




The  volume  available,  together  with  the  schedule,  required  a 
logical  design  with  natural  packaging  properties  in  the  sense  that 
it  should  break,  in  a  natural  way,  into  logical  packages  of  a  reason- 
able size  having  a  minimum  of  interpackage  communication. 

Design  decisions 

The  need  for  2000  operations  per  second  poses  a  serious  access 
problem  with  a  serial  memory,  unless  one  resorts  to  several  simul- 
taneously operating  control  units  which  are  neither  small  nor 
simple.  Hence,  a  random  access  memory  seemed  advisable.  Mag- 
netic core  memories  at  85°C  are  a  problem,  but  they  can  be  built, 
provided  memory  cycle  time  is  not  too  short.  The  memory  was 
chosen as 4096 words of core storage, with a 20-μsec cycle time.

The  requirement  for  training  a  man  in  two  weeks  to  maintain 
the  machine  argues  for  a  simple-structured  parallel  machine. 
Providing  that  much  use  is  made  of  asynchronous  transfer,  there 
are  a  variety  of  simple  maintenance  methods,  particularly  if  a  bus 
structure  is  adopted.  Also,  asynchronous,  or  semi-asynchronous, 
parallel  machines  require  only  average  performance  of  a  set  of 
components,  not  of  any  particular  component;  the  central  limit 
theorem  of  statistics  can  come  to  the  aid  of  reliability.  This  ap- 
proach was  finally  adopted. 

The  simplicity  of  both  design  and  understanding  is  aided  by 
the  use  of  a  microprogram  control  system.  Further,  maintenance 
is  made  rather  simple  by  two  provisions  on  the  maintenance  con- 
sole. 

The  first  of  these  is  a  manner  of  going  through  the  micro- 
program on  a  step-by-step  basis.  While  this  tests  little  of  the 
dynamics,  it  can  often  locate  totally  defective  parts,  and  it  helps 
factory  checkout  immeasurably. 

The  second  is  a  means  of  taking  out  the  microprogram  unit  and 
substituting  a  set  of  switches.  This  permits  a  maintenance  man 
to  exercise  specific  registers,  or  the  memory,  at  will. 

This  is  a  powerful  tool,  and  is  almost  free  with  a  microprogram 
control.  Finally,  and  rather  pragmatically,  microprogramming 
permits  "last  minute"  changes  in  machine  operation  without  seri- 
ous hardware  modifications.  This  approach  was  chosen. 

Regardless  of  the  control  used,  at  various  times  in  the  process 
of  executing  orders,  decisions  must  be  made.  Occasionally  these 
are  on  a  single  bit,  more  often  on  two,  and  occasionally  on  more 
than  two.  If  one  excludes  order  decoding,  only  such  functions  as 
zero  detection  require  the  use  of  more  than  two  bits.  At  this  point, 
the  logical  designer  is  faced  with  a  rather  sticky  decision:  whether 
to design a specific set of decision logic, which is cheap to build
but sometimes messy, or to use some microcontrolled logic-
generating scheme.

In  this  case,  the  latter  alternative  was  taken.  A  unit,  called  (for 
several obscure reasons) the alteration unit, was designed which
amounted to a three-address, one-bit unit. It can generate any
Boolean function of two binary variables and transmit this value
to  another  variable.  A  special  set  of  logic  was  needed  for  detecting 
zeros. 
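
A three-address, one-bit unit of this kind can be sketched as follows. The Python below is a minimal illustration: encoding the chosen Boolean function as a four-bit truth table, and the names `alter`, `flags`, and the function constants, are assumptions and are not taken from the SD-2 microcode.

```python
# Truth tables indexed by (first_variable << 1) | second_variable;
# bit i of the constant is the function value for index i.
AND, OR, XOR, SHEFFER = 0b1000, 0b1110, 0b0110, 0b0111

def alter(flags, truth_table, src1, src2, dest):
    """Set flags[dest] to f(flags[src1], flags[src2]) for the given table."""
    index = (flags[src1] << 1) | flags[src2]
    flags[dest] = (truth_table >> index) & 1
    return flags[dest]

flags = {"x": 1, "y": 0, "z": 0}
alter(flags, XOR, "x", "y", "z")     # z becomes x XOR y
print(flags["z"])                    # 1
```

Four bits of function select are enough for all 16 Boolean functions of two variables, which is what gives the microcoder the same freedom in decision logic as in the arithmetic unit.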

Because  of  the  rather  wild  nature  of  the  inputs,  it  seemed 
desirable  to  include  a  trapping  mode.  The  logic  for  this  was  made 
an  adjunct  to  the  alteration  unit. 

The  circuitry  chosen  was  resistor-transistor  logic,  which  yields 
either  Sheffer  stroke  or  NOR  logic,  as  one  prefers,  high  or  low 
true logic, and p-n-p or n-p-n transistors. In this case, the combination
was high true logic and p-n-p transistors, so that the
logical operation is Sheffer stroke. Because of temperature and
reliability requirements, the maximum frequency available was a
250-kc square wave. This gave a cycle time of 4 μsec available
for  asynchronous  transfer  in  any  sequence  of  logic. 

An  index  register  seemed  advisable  because  of  the  amount  of 
data  processing.  Thus,  additions  were  needed  for  indexing,  arith- 
metic, and  counter  advance.  It  seemed  undesirable  to  have  more 
than  one  parallel  adder,  so  that  an  adder  accessible  to  all  registers 
was  chosen.  This  was  another  argument  for  a  bus  structure. 

Because  of  the  multiplicity  of  problems  being  handled  simul- 
taneously, one  index  register  was  not  really  enough.  Rather  than 
add  another  register,  indirect  addressing  was  chosen. 

At  this  point,  one  needs  12  bits  for  address,  one  for  index 
tagging,  and  one  to  specify  whether  the  address  is  direct  or  in- 
direct, or  14  bits  for  operand  selection.  Thirty-two  orders  was  a 
tight  minimum,  so  the  minimum  word  length  was  19  bits.  Since 
this  was  consistent  with  five  decimal  place  accuracy,  it  was  tenta- 
tively chosen.  It  was  decided,  however,  to  design  a  structure 
basically  suited  to  any  length  word. 
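
The word-length arithmetic can be checked with a short Python sketch. The field widths come from the text; the bit positions chosen for the fields (order in the most significant five bits, then the indirect and index bits, then the address) are an assumption made for illustration.

```python
ORDER_BITS, INDIRECT_BITS, INDEX_BITS, ADDRESS_BITS = 5, 1, 1, 12
WORD_BITS = ORDER_BITS + INDIRECT_BITS + INDEX_BITS + ADDRESS_BITS   # 19

def pack(order, indirect, index, address):
    """Assemble a 19-bit order word (field positions are assumed)."""
    assert 0 <= order < 2 ** ORDER_BITS and 0 <= address < 2 ** ADDRESS_BITS
    assert indirect in (0, 1) and index in (0, 1)
    return (order << 14) | (indirect << 13) | (index << 12) | address

def unpack(word):
    """Split a word into (order, indirect, index, address)."""
    return (word >> 14) & 0x1F, (word >> 13) & 1, (word >> 12) & 1, word & 0xFFF

def decimal_places(word_bits):
    """Full decimal digits representable when one bit is kept for the sign."""
    largest = 2 ** (word_bits - 1) - 1
    return len(str(largest)) - 1

print(WORD_BITS, decimal_places(WORD_BITS))   # 19 5
```

Twelve address bits reach the 4096 words of core storage, and the 18 magnitude bits give a largest value of 262,143, which covers every five-digit decimal number.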

Shifting  is  necessary  to  multiply  and  divide  and  is  required  on 
two  registers,  yet  shift  registers  for  asynchronous  operation  are 
complex.  Hence,  it  was  decided  to  put  the  shift  facility  on  the 
data  transfer  bus.  By  providing  complementing  here,  subtraction 
could be generated.

It was decided to use two's-complement arithmetic, first because
of the simplicity of the multiply-divide logic, and second because
it avoids the whole negative zero question.
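
How complementing on the bus yields subtraction can be sketched in Python. Pairing the bus complement with an initial carry of one into the adder is the usual two's-complement technique and is assumed here; the function names are illustrative only.

```python
BITS = 19
MASK = (1 << BITS) - 1

def complement(x):
    """Bitwise complement, as applied on the way across the bus."""
    return ~x & MASK

def add(f, g, initial_carry=0):
    """Parallel adder with an optional initial carry."""
    return (f + g + initial_carry) & MASK

def subtract(a, b):
    """a - b formed as a + complement(b) + 1."""
    return add(a, complement(b), initial_carry=1)

def to_signed(x):
    """Interpret a 19-bit pattern as a two's-complement integer."""
    return x - (1 << BITS) if x >> (BITS - 1) else x

print(to_signed(subtract(5, 7)))   # -2
print(subtract(0, 0))              # 0: there is only one representation of zero
```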

The precise number of microsteps needed was determined by
a trial microprogram. The machine was designed for up to 512
microsteps although only 384 are now used. Of the nine bits needed
to select a microstep, eight were in
a register, called /, and one was a flip-flop, TO, in the alteration
unit,  thus  allowing  fixed  sequence  with  a  one-bit  micropro- 
grammed choice.  This,  incidentally,  is  the  genesis  of  the  name 
"alteration  unit." 


The  SD-2  computer 

Figure 1 is a block diagram of the computer. There will be, presently,
a block-by-block description of the computer.

The  two  boxes  on  the  left  were  added  to  facilitate  input  and 
output.  The  output  buffer  holds  20  words,  and  outputs  all  values 
in  a  4.8-msec  cycle,  thus  providing  for  nearly  continuous  outputs. 
The  output  distributor  is  a  selection  system  which  allows  the 
programmer  to  transmit  the  contents  of  the  accumulator  onto  one 
of eight channels to control external devices. The "inputs" line
represents up to 32 channels which can be read into the accumulator.
The numbers 8 and 32 are purely arbitrary; the upper limit
of 32 is a microcode convenience only.

The alteration unit, in addition to its decision making duties,
has several other functions. It has a five bit counter, used for
microsubroutines, which can be set to any value chosen or to any
number on the arithmetic unit. The alteration unit can sense when
this counter goes from all zeros to all ones. In addition, the flip-flops
controlling initial carry in the adder, end carry in shifting, and
memory read or write control are in this unit.

Fig. 1. Computer block diagram. [Blocks: inputs, memory, output distributor, output buffer, arithmetic unit, microprogram unit, alteration unit. Signals: data, address, order, trap signal.]

Fig. 2. Arithmetic flow. [Blocks: memory address, memory data, registers, b bus, shift unit, d bus. External connections: from elsewhere, to elsewhere.]

Figure  2  is  a  block  diagram  of  the  arithmetic  unit.  Information 
may be put onto the b bus from any register, or from outside
sources, such as inputs, or constants from the microprogram unit;
thence  to  the  shift  unit,  and  finally  to  the  d  bus.  From  the  d  bus, 
it  may  be  sent  to  other  places,  such  as  the  output  distributor, 
microprogram  register,  etc.,  or  to  an  arithmetic  register. 

Data  and  addressing  between  memory  and  the  arithmetic  unit 
have  their  own  private  channels,  leaving  the  bus  free  during 
memory operation. The memory buffer and address register are
a  part  of  the  arithmetic  unit. 

Figure  3  is  an  expanded  view  of  this  unit.  Capital  letters  stand 
for  registers,  small  letters  for  logical  entities.  Registers  A,  B,  C 
and E are simply storage registers, and are used as the Accumulator,
B-line, Counter and Extension (least significant arithmetic)
register. The Distributor, D, is the memory buffer, and is often
used as working storage. Registers F and G are the inputs to the
adder logic. The a logic is the algebraic sum of (F) + (G); e is
a rather weird logic, (e = F + G, which is used in generating
the extract order); /, which yields FG' + F'G (a prime denoting the
complement), is used for the "exclusive or" generation; c is the carry logic; g is a constant
emitter,  under  microprogram  control;  and  h  is  a  set  of  gates  used 
for  input. 

As a number moves from b to d, one of five operations may
be  performed;  viz.,  normal,  shift  left  one  bit,  shift  right  one  bit, 
complement  or  shift  left  5  bits.  The  last  is  used  for  automatic  fill 
and  in  connection  with  the  microprogram  unit  control. 

As an example, to add the numbers in the A and D registers,
three microprogram steps would be needed. First, transfer A to
G, D to F, and finally a to A; 12 μsec would be required.
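
The three-step addition can be traced with a minimal Python sketch. The register names and the 4-μsec step time come from the text; representing a microstep as a (source, destination) pair and the helper name `run_microsteps` are assumptions made for illustration.

```python
BITS = 19
MASK = (1 << BITS) - 1
STEP_USEC = 4                     # one asynchronous transfer per clock cycle

def run_microsteps(regs, steps):
    """Apply (source, destination) bus transfers; return elapsed microseconds."""
    for source, dest in steps:
        if source == "a":         # the a logic: sum of adder inputs F and G
            value = (regs["F"] + regs["G"]) & MASK
        else:
            value = regs[source]
        regs[dest] = value
    return len(steps) * STEP_USEC

regs = {"A": 5, "D": 7, "F": 0, "G": 0}
elapsed = run_microsteps(regs, [("A", "G"), ("D", "F"), ("a", "A")])
print(regs["A"], elapsed)         # 12 12
```

Because every transfer passes over the single bus, the cost of an operation is simply the number of transfers it needs, which is why the single shared adder is cheap in hardware but not in time.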




Fig. 3. Arithmetic unit detail. [Labels: memory data, memory address, external world, shift unit.]


Figure  4  is  a  diagram  of  the  microprogram  unit.  The  eight-bit 
/  register,  augmented  by  the  TO  flip-flop  of  the  alteration  unit, 
is  decoded  for  up  to  512  steps.  Students  of  microprogramming  will 
recognize  the  Wilkes  model  in  its  pure  form  [Wilkes  and  Stringer, 
1953]. The "next" value of the microprogram register may be
chosen  in  one  of  three  ways. 

First,  the  value  may  be  controlled  by  the  microprogram  itself. 

Second,  five  bits  of  the  bus,  corresponding  to  the  order  portion 
of  the  word,  may  be  entered;  the  other  three  bits  are  set  to  zero. 
In  this  manner,  the  order  decoding  is  accomplished. 

Third,  all  eight  bits  of  the  /  register  may  be  filled  from  the 
d  bus.  In  practice,  the  order  is  shifted  five  bits  to  the  left,  pre- 
senting eight  bits  of  the  address  to  get  the  /  register.  In  this 
manner,  one  may  generate  "no  address"  commands. 
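
The second and third ways of loading the microprogram register can be sketched in Python. The sketch assumes that the order occupies the five most significant bits of the 19-bit word, that the order code lands in the low five bits of the register, and that the register takes the eight most significant bits of the d bus; none of these bit positions is stated in the paper.

```python
WORD_BITS = 19
WORD_MASK = (1 << WORD_BITS) - 1

def next_from_microprogram(value):
    """First way: the microprogram itself supplies the eight-bit value."""
    return value & 0xFF

def next_from_order(word):
    """Second way: enter the five order bits; the other three bits are zero."""
    return (word >> 14) & 0x1F

def next_from_address(word):
    """Third way: shift left five bits, then take eight bits from the bus."""
    shifted = (word << 5) & WORD_MASK          # the order bits fall off the top
    return shifted >> (WORD_BITS - 8)          # leading eight address bits

word = (0b10110 << 14) | 0b10101010111111      # order 22, 14 operand bits
print(next_from_order(word), next_from_address(word))   # 22 170
```

Under these assumptions a no-address command is simply an order whose address field names the microstep at which execution should begin.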

In  principle,  the  programmer  may  start  on  any  microstep  which 
amuses him; in practice, only a limited number of these will yield
no-address orders, the other steps being used for parts of add,
subtract, order procure, etc. The author has no doubt, however,
that someone will find a useful reason for popping into the middle
of  divide  or  some  other  command.  There  is  no  feature  of  a  ma- 
chine, however  pathological,  which  cannot  be  exploited  by  a 
programmer. 

The  actual  decoding  of  these  nine  bits  is  accomplished  partly 
by logic, and partly by current switching of the clock pulse. A
diode  matrix  is  used  to  convert  the  microsteps  into  control  signals. 

No  more  than  15  micro  operations  may  be  called  out  on  a  single 
step,  including  selection  of  the  next  microorder. 

When  stepping  the  microregister,  a  ploy  is  used  to  reduce  the 
number  of  diodes.  Instead  of  specifying  the  next  step,  the  micro- 
coder  specifies  the  bits  of  /  which  he  wishes  to  reverse.  Instead 
of the minimum latency coding of earlier days, the microcoder
of the SD-2 must do minimum diode coding. This is roughly analogous
to asking for a fast, efficient computer program containing
a minimum of 1's. The author, as well as others, has spent endless
hours  trying  to  devise  a  computer  program  to  do  such  microcoding, 
with  no  results. 
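
The diode-saving ploy amounts to storing, for each step, the exclusive-or of the current and next register values. The Python sketch below counts diodes under the simplifying assumption of one diode per reversed bit; the cost model and the function names are illustrative and are not taken from the SD-2 design.

```python
def reversal_mask(current, nxt):
    """Bits of the microprogram register that must be reversed."""
    return current ^ nxt

def diode_cost(path):
    """Total reversed bits along a sequence of microstep addresses."""
    return sum(bin(reversal_mask(a, b)).count("1")
               for a, b in zip(path, path[1:]))

sequential = [0, 1, 2, 3, 4]        # steps placed at consecutive addresses
gray_like = [0, 1, 3, 2, 6]         # each step differs from the last in one bit
print(diode_cost(sequential), diode_cost(gray_like))   # 7 4
```

Placing consecutive microsteps at addresses that differ in few bits lowers the count, which is why the assignment of addresses becomes a coding problem in its own right.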

One  may  note  in  passing  that  the  man  who  wrote  the  micro- 
code, Tomo  Hayata,  has  for  several  years  specialized  in  advanced 
programming problems. Wilkes' views,¹ that logical design will
in the future be done by programmers, seem to be verified here.
Because of the limited microarithmetic available here, microcoding
of the highest order is a must, since each microstep is 4
μsec of time.

For  simple  orders  (e.g.,  extract),  the  processes  of  order  procure, 
indexing (but not indirect addressing), operand procure and execution
can be compressed into the time for two memory cycles,
i.e., 40 μsec. Each indirect reference adds another memory cycle

¹Private communication; Aug. 17, 1959.


Fig. 4. Microprogram unit. [Diagram labels: clock, control outputs.]


Chapter  29  I  The  design  of  a  general-purpose  microprogram-controlled  computer  with  elementary  structure  345 


to  this  time.  Only  on  multiply,  divide,  and  shift  does  the  ultra- 
simple  structure  begin  to  be  expensive  in  time. 

If  the  temperature  requirement  were  not  imposed,  the  clock 
frequency  could  be  doubled,  materially  improving  the  perform- 
ance of  the  machine  on  multicycle  orders. 

Figure  5  is  a  block  diagram  of  the  alteration  unit.  It  consists 
of  gates  which  permit  entry  of  conditions  within  the  computer 
or the outside world, flip-flops used as working storage, flip-flops,
including T0, to make its conclusions known to all and sundry,
a five-bit tally register (I), a circuit to detect a zero on the d bus,
and the trap logic. There are as many as 20 input gates, 9 storage
flip-flops and 10 output flip-flops, exclusive of T0.

The I register can change its contents in one of two ways, viz.,
counting down by one, or by accepting an entry from the d bus.
It may transmit intelligence in two ways, viz., to the h bus, or
by  notifying  the  input  gate  system  that,  should  anyone  care,  it 
has  just  counted  past  zero. 

The  zero  detector  signals  the  truth  of  the  statement  that  d  is 
identically  zero.  In  practice,  it  checks  only  the  lower  digits,  not 
the sign. This is related to the existence of the number -1 in
a two-complement system, which is the system's answer to the
negative  zero  of  a  one's  complement  logic. 

The  trap  logic  is  as  follows:  one  of  the  output  signals  of  the 
alteration  unit  signals  whether  or  not  the  system  is  receiving  trap 
signals;  if  it  is  not,  the  trap  logic  makes  a  note  of  callers.  When 
the  system  is  again  accepting  those  signals,  it  transmits  whether 
or  not  signals  have  been  received,  and  resets  its  memory  to  zero. 
The  timing  is  such  that  no  trap  signal  will  ever  be  lost. 
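The trap behavior just described can be modeled as a small latch. The sketch below is a hypothetical model, not the SD-2 circuit: the class and method names are invented, and a single remembered bit is assumed because the text says only that the logic "resets its memory to zero."

```python
class TrapLogic:
    """Hypothetical model of the trap latch in the alteration unit."""

    def __init__(self) -> None:
        self.accepting = True   # output signal: is the system receiving traps?
        self.noted = False      # the trap logic's memory of callers (assumed one bit)

    def stop_accepting(self) -> None:
        """The system stops receiving trap signals."""
        self.accepting = False

    def trap_signal(self) -> bool:
        """A trap arrives. Return True if seen at once, False if only noted."""
        if self.accepting:
            return True
        self.noted = True
        return False

    def start_accepting(self) -> bool:
        """Resume accepting; report whether a signal was noted, then reset."""
        self.accepting = True
        received, self.noted = self.noted, False
        return received


if __name__ == "__main__":
    trap = TrapLogic()
    trap.stop_accepting()
    trap.trap_signal()                 # arrives while the system is not listening
    print(trap.start_accepting())      # the noted caller is reported on resumption
```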


[Fig. 5 diagram labels: I register, zero detect, input gates, storage flip-flops, output flip-flops, trap logic, logic unit.]

The  lines  going  into  the  logic  unit  are  actually  two  busses.  Any 
logic  source  may  read  to  either  bus.  The  logic  unit  has  four  control 
wires  from  the  microprogram  unit,  specifying  which  of  the  16 
Boolean  functions  of  the  two  busses  is  to  be  put  on  the  output 
bus.  This  value  is  then  routed  to  the  appropriate  logic  destination. 
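Four control wires suffice because a function of two one-bit inputs is fully described by a four-entry truth table. The following Python sketch makes this concrete. The encoding of the control wires is an assumption chosen for clarity (the control value is read directly as the truth table); the text does not give the SD-2's actual wire assignment.

```python
def logic_unit(control: int, bus_a: int, bus_b: int) -> int:
    """Return the one-bit value of the Boolean function selected by control.

    Assumed encoding: bit (2*bus_a + bus_b) of control holds the output
    for that input pair, so control is simply the function's truth table.
    """
    if not 0 <= control < 16:
        raise ValueError("control must fit on four wires")
    if bus_a not in (0, 1) or bus_b not in (0, 1):
        raise ValueError("each bus carries a single bit in this sketch")
    return (control >> (2 * bus_a + bus_b)) & 1


# Truth tables for three familiar functions under the assumed encoding.
AND, OR, XOR = 0b1000, 0b1110, 0b0110

if __name__ == "__main__":
    for name, control in [("AND", AND), ("OR", OR), ("XOR", XOR)]:
        table = [logic_unit(control, a, b) for a in (0, 1) for b in (0, 1)]
        print(name, table)
```

Since there are four input pairs and each may yield 0 or 1, there are exactly 2^4 = 16 distinct functions, one per control setting.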

The  output  flip-flops  have  inputs  from  the  logic  unit,  and  their 
outputs  go  to  various  control  points  in  the  machine.  Three  major 
points  are:  (1)  establishing  whether  a  memory  cycle  is  read,  restore 
or erase/write; (2) setting the initial carry in the adder; and (3)
determining  what  value  shall  shift  into  the  vacant  spot  on  a  left 
or  right  shift. 

The initial carry is used for more than simply adding one to
a value; since the logic is two complement, but the one complement
is transmitted on the bus, the initial carry is, in general,
one during subtraction and zero during addition.
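The role of the initial carry follows from the identity that the two's complement of a value is its one's complement plus one. A short Python sketch illustrates the arithmetic; the eight-bit width and the function names are assumptions made for the example and do not reflect the SD-2 word length.

```python
WIDTH = 8                       # hypothetical word width, for illustration only
MASK = (1 << WIDTH) - 1


def ones_complement(x: int) -> int:
    """Invert every bit of x within the word."""
    return ~x & MASK


def adder(a: int, b: int, initial_carry: int) -> int:
    """Add two words plus the initial carry, discarding overflow."""
    return (a + b + initial_carry) & MASK


def add(a: int, b: int) -> int:
    """Addition: the operand goes on the bus as is, initial carry zero."""
    return adder(a, b, 0)


def subtract(a: int, b: int) -> int:
    """Subtraction: the one's complement goes on the bus, initial carry one."""
    return adder(a, ones_complement(b), 1)


if __name__ == "__main__":
    print(add(5, 3), subtract(5, 3), subtract(3, 5))
```

The last result is the two's complement representation of a negative difference within the assumed word width.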

Microprogram  details 

Figure  6  gives  circuit  details  of  the  microprogram  decode  system. 
The  nine  flip-flops  used  are  broken  into  two  groups,  one  of  four, 
the  other  of  five  flip-flops.  These  are  decoded  into,  respectively, 
16 and 32 wires. In each group, one and only one wire goes negative.
When the clock signal, of 2 μsec width, is applied to the
emitters of the first set of 16 gates, it is passed by the selected
gating transistor. From the collector of this transistor, it is routed
to the emitter of a set of 32 transistors; again, only one can pass
current. Thus, the clock signal is routed to one of 16 × 32 = 512
lines. Diodes on the selected line then cause this signal to be
routed  to  appropriate  gates  in  the  arithmetic  or  alteration  unit. 

By  appropriate  placement  of  diodes,  a  microstep  can  operate 
a  variety  of  gates,  the  number  of  which  is  limited  by  the  current 
available. 
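The two-stage selection can be sketched as follows. Which of the nine flip-flops form the four-bit group and which the five-bit group is not stated in the text; the sketch assumes, purely for illustration, that the four high-order bits feed the first tree.

```python
def decode(step: int) -> tuple:
    """Split a nine-bit microstep into its two decoded wire numbers.

    Assumption: the four high-order bits drive the 16-way tree and the
    five low-order bits drive the 32-way tree.
    """
    if not 0 <= step < 512:
        raise ValueError("microstep must fit in nine bits")
    first = step >> 5          # one of 16 wires
    second = step & 0b11111    # one of 32 wires
    return first, second


def selected_line(step: int) -> int:
    """Return which of the 16 x 32 = 512 lines receives the clock pulse."""
    first, second = decode(step)
    return first * 32 + second


if __name__ == "__main__":
    print(decode(0b101100111))
    print(selected_line(0b101100111))
```

Every microstep selects exactly one line, which is what allows the diode matrix on that line to define the step's control signals.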

Some  of  the  microcontrol  wires  return  to  the  /  register  so  that 
the  microcoder  may  control  the  selection  of  the  next  microstep. 
This  register  is  so  designed  that  the  actual  change  of  state  is 
inhibited  until  the  clock  goes  negative. 

While each output of the decoding trees may go to 16 bases,
only  one  transistor  of  the  16  will  have  a  signal  on  the  emitter; 
thus  only  one  must  be  driven. 

From  an  engineering  point  of  view,  the  control  of  a  computer 
is  an  elaborate  timing  system.  A  microprogram  unit  is  thus  a 
programmable  timing  generator.  The  gating  transistor  diode  de- 
coding system  is  but  one  of  many  ways  to  achieve  this. 

Wilkes has observed¹ that, with the diode system, one has an


Fig.  5.  Alteration  unit. 


¹M. V. Wilkes, private communication; Aug. 17, 1959.


Part  4     The  instruction-set  processor  level:  special-function  processors 


Section  3  |  Processors  defined  by  a  microprogram 


Fig. 6. Details of the microdecode system. [Diagram labels: select, decode, selected microstep, clock, microcontrol signals.]


acute packaging problem. He and his co-workers have been led
to consider the use of switch-core decoding [Wilkes et al., 1958a].

Eachus¹ and his co-workers have evolved yet another switch-core
system which does not depend on coincident current switching.

Order  code 

Since  the  order  code  is  only  a  small  problem  in  the  design  of  a 
microprogrammed  machine  (GOTT  SEI  DANKE),  there  is  little 
need  to  dwell  on  it.  There  are  several  comments  of  design  interest, 
however. 

We  were  unable,  with  this  structure,  to  get  the  multiplication 
below  five  microsteps  per  iteration,  nor  the  divide  below  six,  thus 
costing respectively 20 and 24 μsec per bit dealt with. Moreover,
division  required  some  precalculations  (overflow  detect)  and  some 

¹Dr. Joseph Eachus of Minneapolis-Honeywell, private conversation; September, 1959.


postcalculation  (obtaining  a  rounded  quotient  with  a  correct  re- 
mainder) which  further  boosted  its  time. 

Because  of  the  asynchronous  nature  of  transfer,  it  is  not  possible 
to  read  into  and  out  of  a  register  simultaneously.  Hence,  shifting 
one register requires two steps, or 8 μsec per bit, and double-length
shifting requires 16 μsec. This is painful.

Because  of  the  short  words,  four  double-length  orders  were 
microprogrammed:  add,  subtract,  clear  and  add,  and  store.  These 
take a total of 60 μsec to execute.
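The per-bit figures quoted above follow directly from the 4 μsec microstep. The arithmetic is reproduced in the Python sketch below; the 20 μsec memory cycle is inferred from the statement that two memory cycles take 40 μsec, and the function names are illustrative only.

```python
MICROSTEP_USEC = 4       # one microstep, as stated in the text
MEMORY_CYCLE_USEC = 20   # inferred: two memory cycles are given as 40 usec


def per_bit_time(microsteps_per_bit: int) -> int:
    """Time in microseconds spent on each bit of an iterative order."""
    return microsteps_per_bit * MICROSTEP_USEC


def simple_order_time(indirect_references: int = 0) -> int:
    """Time for a simple order: two memory cycles plus one per indirect reference."""
    return (2 + indirect_references) * MEMORY_CYCLE_USEC


if __name__ == "__main__":
    orders = [("multiply", 5), ("divide", 6),
              ("shift", 2), ("double-length shift", 4)]
    for name, steps in orders:
        print(name, per_bit_time(steps), "usec per bit")
    print("simple order, one indirect reference:", simple_order_time(1), "usec")
```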

A  rich  collection  of  branch  orders  was  included.  BRanch  Un- 
conditionally, BRanch  Negative,  and  BRanch  Zero  are  self- 
explanatory.  BRanch  on  B  is  the  tally  loop  order  which  decreases 
(B) by one, and branches if it does not go negative. BR1, BR2,
BR3, and BR4 are sense toggle branch; if the toggle is set, it is
turned off and the program branches. These sense toggles are
actually storage flip-flops T1, T2, T3, and T4 of the alteration unit.
These may be set by other orders. T1 is also used as an overflow
mark. 
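The tally loop order and the sense toggle branches can be expressed compactly. The Python sketch below models only the branch decisions as described in the text; the class and method names are hypothetical, and nothing here represents the actual microcode.

```python
class BranchState:
    """Hypothetical model of the B register and sense toggles T1 to T4."""

    def __init__(self, b: int = 0) -> None:
        self.b = b
        self.toggles = {1: False, 2: False, 3: False, 4: False}

    def branch_on_b(self) -> bool:
        """Decrease (B) by one; branch if it does not go negative."""
        self.b -= 1
        return self.b >= 0

    def branch_on_toggle(self, n: int) -> bool:
        """If toggle n is set, turn it off and branch; otherwise fall through."""
        if self.toggles[n]:
            self.toggles[n] = False
            return True
        return False


if __name__ == "__main__":
    state = BranchState(b=2)
    passes = 1                      # the loop body runs once before the first test
    while state.branch_on_b():
        passes += 1
    print("loop body executed", passes, "times")
```

Because the toggle is cleared by the branch itself, a second test of the same toggle falls through unless some other order has set it again.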




The  machine  has  a  "dynamic"  idle.  When  it  is  halted,  either 
externally  or  by  order,  this  fact  is  observed  by  the  microprogram, 
through  the  alteration  unit,  whereupon  the  microprogram  goes 
into  a  tight  loop,  continuously  asking,  "Can  I  go?  Can  I  go?  Can 
I  go?  .  .  .  ."  Two  forms  of  halting  are  provided.  In  "Halt  and 
Display,"  registers  are  presented;  in  the  other  halt,  the  console 
lights  are  left  unaltered.  A  manual  halt  is  equivalent  to  halt  and 
display. 

For  an  addressed  order,  bit  positions  one  through  five  are  sent 
into  the  microprogram  unit.  During  order  procure,  the  micro- 
program examines  bits  zero  and  six  for  indirect  addressing  and 
index  modification. 

A nonaddress order is recognized by the binary equivalent of
31  in  the  order  bits;  the  microprogram  unit  causes  the  order  word 
to  shift  left  5  bits,  and  the  8  high  bits  of  the  "address"  field  enter 
the  /  register. 

Conclusion 

This  paper  is  not  intended  to  be  an  argument  in  favor  of  the 
general  acceptance  of  the  SD-2  structure  as  an  ideal.  Like  all 
computers,  the  SD-2  is  a  state-of-the-art  device,  intended  not  only 
to meet the needs of the problems at hand, but also, more importantly,
to meet the side conditions of its use. In a vague analogy,
the computer specification is like a partial differential equation.
The logical designer must choose the boundary conditions and
solve  the  problem,  or  at  least  approximate  the  solution. 

With  today's  emphasis  on  system  speed  performance,  some 
serious  mental  gear-shifting  on  the  designer's  part  is  required  in 
order  to  design  a  simple  machine.  It  goes  against  the  grain  of 
instinct  and  experience.  A  posteriori,  the  SD-2  could  have  been 
made  even  simpler,  particularly  with  respect  to  several  peripheral 
areas  not  discussed  in  the  paper. 


Several  conclusions  can  be  drawn  here,  however.  The  bus 
structure  is  easy  to  fabricate  and  maintain;  this  has  been  proven 
on the MILSMAC, a breadboard for the SD-2. It is a highly flexible
structure, permitting wide variation in order code with no change
in arithmetic unit. At the same time, the components are cascaded
to a point where one has the absurd situation of fast-switching
in a relatively slow computer. A designer of a bus-structured
machine would do well to consider alternatives, such as multiple
busses, accumulators, etc., to permit more parallelism when speed
is  important. 

The use of a general-purpose logic unit, such as the alteration
unit of the SD-2, gives a freedom of design not possible with a
special-purpose logic. At the same time, it uses more parts, is slow
in handling multiple variable problems, and requires a great deal
of  control  input.  It  appears  to  be  a  weapon  of  opportunity. 

The  use  of  microprogramming  is  much  the  same  as  the  general 
logic  unit.  Its  flexibility  and  speed  of  design  are  unquestionable. 
Also, it uses more parts than a special-purpose control.

There  is  no  real  substitute  for  a  special-purpose  design.  The 
use of generalized elements in computer design can be justified
only  by  the  side  conditions,  never  by  the  basic  specification. 
Where  simplicity  and  speed  of  design  are  major  items,  their  use 
seems  indicated. 

Wilkes  once  presented  a  paper  on  the  best  way  to  design  a 
computer  and  launched  the  microprogramming  notions.  The 
author would like to comment that if ease and reliability of design
are criteria, he was absolutely correct.

References 

KampT60; WilkM53a; WilkM58a


Section  4 

Processors  based  on  a  programming 
language 

Programming-language-based processors are described in
Chap.  3  (page  73).  Three  examples  are  presented  in  this  sec- 
tion. Two  of  the  languages,  FORTRAN  and  EULER,  are  algebraic 
languages operating on conventional data types, whereas IPL-VI
is  more  like  a  conventional  machine  language  operating  on 
unconventional  data  types  (i.e.,  list  structures).  A  peculiar  fea- 
ture of  IPL-VI  is  its  conception  of  data  as  program  (as  well  as 
of  program  as  data)  and  the  multiprogramming  organization 
to  which  this  led. 

A  command  structure  for  complex  information  processing 

The  IPL-VI  processor  (Chap.  30)  discussed  in  Part  3,  Sec.  5, 
is an outgrowth of the IPL series of programming languages
by  Newell,  Shaw,  and  Simon.  The  paper  seriously  treats  both 
the  language  and  the  merits  of  casting  a  language  in  a  hardware 
processor.  IPL-VI  was  never  implemented  in  hardware.  (A  partial 
IPL  V  processor  for  the  CDC  3600  was  built  at  the  Argonne 
National  Laboratory.)  A  hardware  processor  for  IPL-VI  in  the 
third  generation  would  undoubtedly  exist  as  an  interpreter  in 
a  microprogrammed  processor. 

System  design  of  a  FORTRAN  machine 

This paper (Chap. 31) presents a way to map a software program
into hardware. The machine's passes (or modes) correspond
to activities one would see when compiling, loading, and
executing  a  FORTRAN  program. 

BCD  format  is  used  for  the  arithmetic.  The  symbol  table  is 
simply  organized  and,  therefore,  has  to  be  searched.  A  more 
serious  approach  for  the  actual  implementation  of  such  a 
machine  might  follow  the  lines  of  EULER  (Chap.  32). 

A  microprogrammed  implementation  of  EULER 
on IBM System/360 Model 30

This  very  clearly  written  paper  describes  a  processor  to  imple- 
ment an  ALGOL-like  language  [Wirth  and  Weber,  1966].  An 
earlier  processor  was  proposed  to  directly  execute  ALGOL 
[Anderson,  1961].  It  is  implemented  using  the  Model  30  IBM 
System/360  P. microprogrammed.  We  include  the  paper  both 
because  it  describes  the  Model  30  and  because  of  EULER. 

The  P. language  operates  like  a  conventional  compiler  and 
operating  system.  The  description  presents  clearly  the  process 
of  compiling  before  execution. 

The  microprogramming  aspects  of  the  Model  30  are  typical 
of  other  IBM  System/360  models.  The  IBM  approach  to  a 
P. microprogrammed is significantly different from that in
Kampe's  SD-2  (Chap.  29).  In  the  360  a  microprogram  instruc- 
tion is  encoded  in  a  long  word  (60  to  100  bits,  depending  on 
the  model)  with  a  number  of  microcoded  operations  which  can 
be  selected  in  parallel.  The  SD-2  uses  a  short  word,  and  only 
one  operation  is  encoded  in  a  single  instruction. 




Chapter  30 


A  command  structure  for  complex 
information processing¹

J. C. Shaw / A. Newell / H. A. Simon / T. O. Ellis


The  general-purpose  digital  computer,  by  virtue  of  its  large  ca- 
pacity and  general-purpose  nature,  has  opened  the  possibility  of 
research  into  the  nature  of  complex  mechanisms  per  se.  The  chal- 
lenge is  obvious:  humans  carry  out  information  processing  of  a 
complexity  that  is  truly  baffling.  Given  the  urge  to  understand 
either how humans do it, or alternatively, what kinds of mechanisms
might accomplish the same tasks, the computer is turned
to  as  a  basic  research  tool.  The  varieties  of  complex  information 
processing will be understood when they can be synthesized: when
mechanisms  can  be  created  that  perform  the  same  processes. 

The  last  few  years  have  seen  a  number  of  attempts  at  synthesis 
of  complex  processes.  These  have  included  programs  to  discover 
proofs for theorems [Newell et al., 1956, 1957b], programs to
synthesize music [Brooks et al., 1957b], programs to play chess
[Bernstein et al., 1958; Kister et al., 1957], and programs to simulate
the  reasoning  of  particular  humans  [Newell  et  al.,  1958].  The  feasi- 
bility of  synthesizing  complex  processes  hinges  on  the  feasibility 
of  writing  programs  of  the  complexity  needed  to  specify  these 
processes for a computer. Hence, a limit is imposed by the limit
of  complexity  that  the  human  programmer  can  handle.  The 
measure  of  this  complexity  is  not  absolute,  for  it  depends  on  the 
programming  language  he  uses.  The  more  powerful  the  language, 
the  greater  will  be  the  complexity  of  the  programs  he  can  write. 
The  authors'  work  has  sought  to  increase  the  upper  limit  of  com- 
plexity of  the  processes  specified  by  developing  a  series  of  lan- 
guages, called  information  processing  languages  (IPL's),  that  re- 
duce significantly  the  demands  made  upon  the  programmer  in  his 
communication  with  the  computer.  Thus,  the  IPL's  represent  a 
series  of  attempts  to  construct  sufficiently  powerful  languages  to 
permit  the  programming  of  the  kinds  of  complex  processes  previ- 
ously mentioned. 

The IPL's designed so far have been realized interpretively on
current computers [Newell and Shaw, 1957a]. Alternatively, of
course,  any  such  language  can  be  viewed  as  a  set  of  specifications 
for  a  general-purpose  computer.  An  IPL  can  be  implemented  far 

¹Proc. WJCC, pp. 119-128, 1958.


more  expeditiously  in  a  computer  designed  to  handle  it  than  by 
interpretation in a computer designed with a quite different command
structure. The mismatch between the IPL's designed and
current computers is appreciable: 150 machine cycles are needed
to do what one feels should take only 2 or 3 machine cycles. (It
will become apparent that the difficulty would not be removed
by "compiling" instead of "interpreting," to resurrect a set of
well-worn  distinctions.  The  operations  that  are  mismatched  to 
current  computers  must  go  on  during  execution  of  the  program, 
and  hence  cannot  be  compiled  out.) 

The  purpose  of  this  paper  is  to  consider  an  IPL  computer,  that 
is,  a  computer  constructed  so  that  its  machine  language  is  an 
information  processing  language.  This  will  be  called  language 
IPL-VI, for it is the sixth in the series of IPL's that have been
designed. This version has not been realized interpretively, but
has  resulted  from  considering  hardware  requirements  in  the  light 
of  programming  experience  with  the  previous  languages. 

Some  limitations  must  be  placed  on  the  investigation.  This 
paper  will  be  concerned  only  with  the  central  computer,  the 
command  structure,  the  form  of  the  machine  operations,  and  the 
general  arrangements  of  the  central  hardware.  It  will  neglect 
completely input-output and secondary storage systems. This does
not  mean  these  are  unimportant  or  that  they  present  only  simple 
problems.  The  problem  of  secondary  storage  is  difficult  enough 
for current computing systems; it is exceedingly difficult for IPL
systems,  since  in  such  systems  initial  memory  is  not  organized  in 
neat  block-like  packages  for  ease  of  shipment  to  the  secondary 
store. 

Nor  is  it  the  case  that  one  would  place  an  order  for  the  IPL 
computer  about  to  be  described  without  further  experience  with 
it. Results are not entirely predictable. IPL's are sufficiently different
from current computer languages that their utility can be
evaluated  only  after  much  programming.  Moreover,  since  IPL's 
are  designed  to  specify  large  complicated  programs,  the  utility 
of  the  linguistic  devices  incorporated  in  them  cannot  be  ascer- 
tained from  simple  examples. 

One  more  caution  is  needed  to  provide  a  proper  setting  for 




this  paper.  Most  of  the  computing  world  is  still  concerned  with 
essentially  numerical  processes,  either  because  the  problems 
themselves  are  numerical  or  because  nonnumerical  problems  have 
been  appropriately  arithmetized.  The  kinds  of  problems  that  the 
authors  have  been  concerned  with  are  essentially  nonnumerical, 
and  they  have  tried  to  cope  with  them  without  resort  to  arithmetic 
models.  Hence  the  IPL's  have  not  been  designed  with  a  view  to 
carrying  out  arithmetic  with  great  efficiency. 

Fundamental  goals  and  devices 

The  basic  aim,  then,  is  to  construct  a  powerful  programming 
language  for  the  class  of  problems  concerned.  Given  the  amount 
and  kind  of  output  desired  from  the  computer,  a  reduction  in  the 
size  and  complexity  of  the  specification  (the  program)  that  has  to 
be  written  in  order  to  secure  this  output  is  desired. 

The  goal  is  to  reduce  programming  effort.  This  is  not  the  same 
as  reducing  the  computing  effort  required  to  produce  the  desired 
output  from  the  specification.  Programming  feasibility  must  take 
precedence  over  computing  economics;  since  it  is  not  yet  known 
how  to  write  a  program  that  will  enable  a  computer  to  teach  itself 
to  play  chess,  it  is  premature  to  ask  whether  it  would  take  such 
a  computer  one  hour  or  one  hundred  hours  to  make  a  move.  This 
is  not  meant  as  an  apology,  but  as  support  for  the  contention  that, 
in  seeking  to  write  programs  for  very  large  and  complicated  tasks, 
the  overriding  initial  concerns  must  be  to  attain  enough  flexibility, 
abbreviation,  and  automation  of  the  underlying  computing  proc- 
esses to  make  programming  feasible.  And  these  concerns  have  to 
do  with  the  power  of  the  programming  language  rather  than  the 
efficiency  of  the  system  that  executes  the  program. 

In  the  next  section  a  straightforward  description  of  an  IPL 
computer is begun. To put the details in a proper setting, the
remainder  of  this  section  will  be  devoted  to  the  basic  devices 
that  IPL-VI  uses  to  achieve  a  measure  of  power  and  flexibility. 
These  devices  include:  organization  of  memory  into  list  structure, 
provision  for  breakouts,  identity  of  data  with  program,  two-stage 
interpretation,  invariance  of  program  during  execution,  provision 
for  responsibility  assignments,  and  centralized  signalling  of  test 
results. 

List  structure 

The  most  fundamental  and  characteristic  feature  of  the  IPL's  is 
that  they  organize  memory  into  list  structures  whose  arrangement 
is  independent  of  the  actual  physical  geometry  of  the  memory  cells 
and  which  undergo  continual  change  as  computation  proceeds. 
In all computing systems the topology of memory, the characteristics
of hardware and program that determine what memory cells
can be regarded as "next to" a given cell, plays a fundamental
role  in  the  organization  of  the  information  processing.  This  is 
obviously true for serial memories like tape; it is equally true for
random  access  memories.  In  random  access  memories  the  topo- 
logical structure  is  derived  from  the  possibility  of  performing 
arithmetic  operations  on  the  memory  addresses  that  make  use  of 
the  numerical  relations  among  these  addresses.  Thus,  the  cell 
with address 1435 is next to cell 1436 in the specific sense that
the  second  can  be  reached  from  the  first  by  adding  one  to  the 
number  in  a  counter. 

In  standard  computers  use  is  made  of  the  static  topology  based 
on  memory  addresses  to  facilitate  programming  and  computation. 
Index  registers  and  relative  addressing  schemes,  for  example,  make 
use  of  program  arithmetic  and  depend  for  their  efficacy  upon  an 
orderly  matching  of  the  arrangement  of  information  in  memory 
with  the  topology  of  the  addressing  system. 

When  memory  is  organized  in  a  list  structure,  the  relation 
between  information  storage  and  topology  is  reversed.  The  topol- 
ogy of  memory  is  continually  modified  to  adapt  to  the  changing 
needs  of  organization  of  memory  content.  No  arithmetic  operations 
on  memory  addresses  are  permitted;  the  topology  is  built  on  a 
single,  asymmetric,  modifiable,  ordinal  relation  between  pairs  of 
memory  cells  which  is  called  adjacency.  The  system  contains 
processes  that  make  use  of  the  adjacency  relations  in  searching 
memory,  and  processes  that  change  these  relations  at  will  inex- 
pensively in  the  course  of  processing. 

A  list  structure  can  be  established  in  computer  memory  by 
associating  with  each  word  in  memory  an  address  that  determines 
what  word  is  adjacent  to  it,  as  far  as  all  the  operations  of  the 
computer  are  concerned.  Memory  space  of  an  additional  address 
associated  with  each  word  is  given  up,  so  that  the  adjacency 
relation  can  be  changed  as  quickly  as  a  word  in  memory  can  be 
changed.  Having  paid  this  price,  however,  many  of  the  other  basic 
features  of  IPL's  are  obtained  almost  without  cost:  unlimited 
hierarchies  of  subroutines;  recursive  definition  of  processes;  vari- 
able numbers  of  operands  for  processes;  and  unlimited  complexity 
of  data  structure,  capable  of  being  created  and  modified  to  any 
extent  at  execution  time. 
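A list structure of this kind can be sketched in Python as a table of cells, each carrying a symbol and the address of the adjacent cell. This is an illustration in a modern language, not IPL-VI notation: the `Cell` layout, the function names, and the addresses are all assumptions. The addresses are deliberately scattered to show that adjacency does not depend on numerical order.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Cell:
    symbol: str
    link: Optional[int] = None   # address of the adjacent cell; None ends the list


def walk(memory: Dict[int, Cell], head: int) -> List[str]:
    """Follow the adjacency links from head, collecting the symbols met."""
    symbols = []
    address: Optional[int] = head
    while address is not None:
        cell = memory[address]
        symbols.append(cell.symbol)
        address = cell.link
    return symbols


def insert_after(memory: Dict[int, Cell], address: int,
                 new_address: int, symbol: str) -> None:
    """Make a new cell adjacent to the cell at address by changing one link."""
    memory[new_address] = Cell(symbol, memory[address].link)
    memory[address].link = new_address


def example_memory() -> Dict[int, Cell]:
    """A three-cell list whose cells sit at unrelated addresses."""
    return {1435: Cell("A", 20), 20: Cell("B", 900), 900: Cell("C")}


if __name__ == "__main__":
    memory = example_memory()
    print(walk(memory, 1435))
    insert_after(memory, 20, 7, "X")
    print(walk(memory, 1435))
```

No arithmetic is performed on addresses; an insertion rewrites a single link, which is the sense in which the adjacency relation can be changed as quickly as a word in memory.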

Breakouts 

Languages require grammar, fixed structural features, so that they
can  be  interpreted.  Grammar  imposes  constraints  on  what  can  be 
said,  or  said  simply,  in  a  language.  However,  the  constraints  created 
by  fixed  grammatical  format  can  be  alleviated  at  the  cost  of  intro- 
ducing an  additional  stage  of  processing  by  devices  that  allow  one 




to  "break  out"  of  the  format  and  to  use  more  general  modes  of 
specification  than  the  format  permits.  Devices  for  breakouts  ex- 
change processing  time  for  flexibility.  Several  devices  achieve  this 
in  IPL-VI.  Each  is  associated  with  some  part  of  the  format. 

As  an  illustrative  example,  IPL-VI  has  a  single-address  format. 
Without  breakout  devices,  this  format  would  permit  an  informa- 
tion process  to  operate  on  only  a  single  operand  as  input,  and 
would  permit  the  operand  of  a  process  to  be  specified  only  by 
giving  its  address.  Both  of  these  limitations  are  removed:  the  first 
by  using  a  special  communication  list  to  store  operands,  the  second 
by allowing the address for an operand to refer either to the
operand itself or to any process that will determine the operand.

The  latter  device,  which  allows  broad  freedom  in  the  method 
of  specifying  an  operand,  illustrates  another  important  facet  of 
the  flexibility  problem.  Breakouts  are  of  great  importance  in  re- 
ducing the  burden  of  planning  that  is  imposed  on  the  programmer. 
It  is  certainly  possible,  in  principle,  to  anticipate  the  need  for 
particular operands at particular stages of processing, and to provide
the operands in such a way that their addresses are known
to the programmer at the appropriate times. This is the usual way
in  which  machine  coding  is  done.  However,  such  plans  are  not 
obtained  without  cost;  they  must  be  created  by  the  programmer. 
Indeed,  in  writing  complex  programs,  the  creation  of  the  plan  of 
computation  is  the  most  difficult  part  of  the  job;  it  constitutes  the 
task of "programming" that is sometimes distinguished from the
more  routine  "coding."  Thus,  devices  that  exchange  computing 
time  for  a  reduction  in  the  amount  of  planning  required  of  the 
programmer provide significant increases in the flexibility and
power  of  the  language. 

Identity  of  data  with  programs 

In current computers, the data are considered "inert." They are
symbols to be operated upon by the program. All "structure" of
the data is initially developed in the programmer's head and
encoded implicitly into the programs that work with the data. The
structure is embodied in the conventions that determine what bits
the  processes  will  decode,  etc. 

An alternative approach is to make the data "active." All words
in the computer will have the instruction format: there will be
"data programs," and the data will be obtained by executing these
programs.  Some  of  the  advantages  of  this  alternative  are  obvious: 
the  full  range  of  methods  of  specification  available  for  programs 
is  also  available  for  data;  a  list  of  data,  for  example,  may  be  speci- 
fied by  a  list  of  processes  that  determine  the  data.  Since  data  are 
only desired "on command" by the processing programs, this
approach leads to a computer that, although still serial in its
control, contains at any given moment a large number of parallel
active  programs,  frozen  in  the  midst  of  operation  and  waiting  until 
called  upon  to  produce  the  next  operation  or  piece  of  data.  This 
identity  of  data  with  program  can  be  attained  only  if  the  proc- 
essing programs  require  for  their  operation  no  information  about 
the structure of the data programs, only information about how
to  receive  the  data  from  them. 
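A present-day reader may find the idea of "data programs" easiest to picture through generators, which are likewise suspended in mid-operation until asked for the next item. The Python sketch below is an analogy only, not a rendering of IPL-VI; the names `stored_list`, `squares`, and `take` are invented for the example.

```python
from typing import Iterable, Iterator, List


def stored_list(items: Iterable[int]) -> Iterator[int]:
    """A data program whose data are simply written down."""
    for item in items:
        yield item


def squares() -> Iterator[int]:
    """A data program whose data are determined by a process."""
    n = 0
    while True:
        yield n * n
        n += 1


def take(source: Iterator[int], count: int) -> List[int]:
    """A processing program: it knows only how to receive the next datum."""
    return [next(source) for _ in range(count)]


if __name__ == "__main__":
    print(take(stored_list([0, 1, 4]), 3))
    producer = squares()
    print(take(producer, 2))
    print(take(producer, 2))   # resumes where the producer was frozen
```

The processing program `take` needs no information about how either source is structured, which is the condition the text places on the identity of data with program.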

Two-stage  interpretation 

To identify the operand of an IPL-VI instruction, a designating
operation  operates  on  the  address  part  of  the  instruction  to  pro- 
duce the  actual  operand.  Thus,  depending  on  what  designating 
operation  is  specified,  the  address  part  may  itself  be  the  operand, 
may provide the address of the operand, or may stand in a less
direct  relation  to  the  operand.  The  designating  operation  may  even 
delegate  the  actual  specification  of  the  operand  to  another  desig- 
nating operation. 
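The three cases named above can be illustrated with a small Python sketch. The mode names "literal", "direct", and "delegate" are hypothetical labels chosen for this example; they are not the IPL-VI designating operation codes, which this section does not list.

```python
from typing import Any, Dict


def designate(memory: Dict[int, Any], mode: str, address: int) -> Any:
    """First stage of interpretation: turn an address part into an operand.

    Hypothetical modes:
      "literal"  - the address part is itself the operand
      "direct"   - the address part is the address of the operand
      "delegate" - the addressed cell holds another (mode, address) pair,
                   to which the specification is handed on
    """
    if mode == "literal":
        return address
    if mode == "direct":
        return memory[address]
    if mode == "delegate":
        inner_mode, inner_address = memory[address]
        return designate(memory, inner_mode, inner_address)
    raise ValueError("unknown designating operation: " + mode)


if __name__ == "__main__":
    memory = {10: "X", 20: ("direct", 10), 30: ("delegate", 20)}
    for mode, address in [("literal", 10), ("direct", 10), ("delegate", 30)]:
        print(mode, address, "->", designate(memory, mode, address))
```

Only after the operand has been produced in this way does the second stage, the operation proper, act upon it.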

Invariance  of  program  during  execution 

In order to carry out generalized recursions, it is necessary to
provide  for  the  storage  of  indefinite  amounts  of  variable  informa- 
tion necessary  for  the  operation  of  such  routines.  In  IPL-VI  all 
the  variable  information  is  stored  externally  to  the  associated 
routine,  so  that  the  routine  remains  unmodified  during  execution. 
The  name  of  a  routine  can  appear  in  the  definition  of  the  routine 
itself  without  causing  difficulty  at  execution  time. 

Responsibility  assignments 

The automatic handling of such processes as erasing a list or
searching through a list requires some scheme for keeping track
of what part of the list has been processed and what part has
not. For example, in erasing a program containing a local
subroutine that appears more than once within the program, care
must be taken to erase the subroutine once and only once. This
is accomplished by a system for assigning responsibility for the
parts of the list. In general, the responsibility code in IPL-VI
handles these matters without any explicit attention from the
programmer, except in those few situations where the issue of
responsibility is the central problem.

Centralized  signalling  of  test  results 

The structure of the language is simplified by having all conditional
processes set a switch to symbolize their output instead of
producing an immediate conditional transfer of control. Then, a few
specialized processes are defined that transfer control on the basis
of the switch setting. By symbolizing and retaining the conditional
information, the actual transfer can be postponed to the most
convenient point in the processing. The flexibility obtained by this
device proves especially useful in dealing with the transmission
of conditional information from subroutines to the routines that
call upon them.
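
A minimal sketch of this arrangement follows, under the assumption that the switch is a single Boolean; the names Machine, test_equal, and member are hypothetical. The test only records its outcome, and the calling routine decides later where to branch.

```python
class Machine:
    def __init__(self):
        self.signal = False  # the single switch that retains test results

    def test_equal(self, x, y):
        """A conditional process: it records its outcome and transfers nothing."""
        self.signal = (x == y)


def member(machine, item, items):
    """A subroutine that reports its result to its caller through the switch."""
    machine.signal = False
    for candidate in items:
        machine.test_equal(item, candidate)
        if machine.signal:
            break


m = Machine()
member(m, "S2", ["S1", "S2"])
branch = "found" if m.signal else "missing"  # transfer postponed to the caller
```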

General  organization  of  the  machine 

The  machine  that  is  described  can  profitably  be  viewed  as  a 
"control  computer."  It  consists  of  a  single  control  unit  with  access 
to  a  large  random-access  memory.  This  memory  should  contain 
10^  words  or  more.  If  less  than  10''  words  are  available  in  the 
primary  memory,  there  will  probably  be  too  frequent  occasions 
for  transfer  of  information  between  primary  and  secondary  storage 
to  make  the  system  profitable. 

The  operation  of  the  computer  is  entirely  nonarithmetic,  there 
being  no  arithmetic  unit.  Since  arithmetic  processes  are  not  used 
as  the  basis  of  control,  as  they  are  in  standard  computers,  such 
a  unit  is  inessential,  although  it  would  be  highly  desirable  for  the 
computer  to  have  access  to  one  if  it  is  to  be  given  arithmetic  tasks. 
The  computer  is  perfectly  capable  of  proving  theorems  in  logic 
or  playing  chess  without  an  arithmetic  adjunct. 

Memory 

The  memory  consists  of  cells  containing  words  of  fixed  length. 
Each  word  is  divided  into  two  parts,  a  symbol  and  a  link.  The 
entire  memory  is  organized  into  a  list  structure  in  the  following 
way.  The  link  is  an  address;  if  the  link  of  a  word  a  is  the  address 
of  word  b,  then  b  is  adjacent  to  a.  That  is,  the  link  of  a  word 
in  a  simple  list  is  the  address  of  the  next  word  in  the  list. 

The symbol part of a word may also contain an address, and
this may be the address of the first word of another list. As
indicated earlier, the entire topology of the memory is determined
by the links and by addresses located in the symbol parts of words.
The links permit the creation of simple lists of symbols; the links
and symbol parts together, the creation of branching list structures.

The topology of memory is modified by changing addresses in
links and symbol parts, thereby changing adjacency relations
among words. The modification of link addresses is handled
directly by various list processes without the attention of the
programmer. Hence, the memory can be viewed as consisting of
symbol occurrences connected together by mechanisms or
structure whose character need not be specified.

The basic unit of organization is the list, a set of words linked
together in a particular order by means of their link parts, in the
way previously explained. The address of the first word in the
sequence is the name of the list. A special terminating symbol T,
whose link is irrelevant, is in the last word on every list. A simple
list is illustrated in Fig. 1; its name is L100, and it contains two
symbols, S1 and S2.

The symbols in a list may themselves designate the names of
other lists. (The symbols themselves have a special format, so that
they are not names of lists but designate the names in a manner
that will be described.) Thus, a list may be a list of lists, and each
of its sublists may be a list of lists.

An example of a list structure is shown in Fig. 2. The name
of the list structure is the name of the main list, L200. L200 contains
two sublists, L300 and L500, plus an item of information, I4, that
is not a name of a list. L300 in its turn consists of item I1 plus
another sublist, L400, while L500 contains just information, and is
not broken out further into sublists. Each of these lists terminates
in a word that holds the symbol T.
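
The list structure just described can be modelled with a table of (symbol, link) words. This is a sketch under stated assumptions: integer symbols stand for names of sublists, the addresses drawn from spare are arbitrary, and the contents given to L400 are invented for illustration, since the text does not state them.

```python
from itertools import count

T = "T"              # terminating symbol
memory = {}          # address -> (symbol, link)
spare = count(1000)  # hypothetical supply of unused addresses


def build(name, symbols):
    """Store a list whose first word sits at address `name`."""
    cells = [name] + [next(spare) for _ in symbols]
    links = cells[1:] + [None]  # the link of the T word is irrelevant
    for addr, symbol, link in zip(cells, symbols + [T], links):
        memory[addr] = (symbol, link)


def contents(name):
    """Follow links from `name`; descend when a symbol names a sublist."""
    result, addr = [], name
    while memory[addr][0] != T:
        symbol, link = memory[addr]
        # assumption: an integer symbol is the name (address) of a sublist
        result.append(contents(symbol) if isinstance(symbol, int) else symbol)
        addr = link
    return result


build(400, ["I2", "I3"])      # contents of L400 are assumed
build(500, ["I5", "I6"])
build(300, ["I1", 400])
build(200, [300, "I4", 500])
```

Here contents(200) walks the main list and each sublist in turn, stopping at every T.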

Available space list

A list uses a certain number of cells from memory. Which cells
it uses is unimportant as long as the right linkages are set up. In
executing programs that continually create new lists and destroy
old ones, two requirements arise. When creating a list, cells in
memory must be found that are not otherwise occupied and so
are available for the new list. Conversely, when a list is destroyed
(when it is no longer needed in the system) its cells become
available for other uses, but something must be done to gain access
to these available cells when they are needed.

The device used to accomplish these two logistic functions is
the available space list. All cells that are available are linked
together into the single long list. Whenever cells are needed, they
are taken from the front of this available space list; whenever cells
are made available, they are inserted on the front of the available
space list just behind the fixed register that holds the link to the
first available space. The operations of taking cells from the
available space list and returning cells to the available space list
involve, in each case, only changes of addresses in a pair of links.
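
The two operations on the available space list may be sketched as follows. The class and method names (Memory, take, give_back) are assumptions made for this illustration; each operation changes only the fixed register and one link.

```python
NIL = None


class Memory:
    def __init__(self, size):
        # Each word is [symbol, link]. Initially every cell is chained into
        # the available space list: 0 -> 1 -> ... -> size - 1.
        self.words = [[None, i + 1] for i in range(size)]
        self.words[-1][1] = NIL
        self.avail = 0  # fixed register: link to the first available cell

    def take(self):
        """Remove the front cell of the available space list and return it."""
        cell = self.avail
        if cell is NIL:
            raise MemoryError("available space list is empty")
        self.avail = self.words[cell][1]
        self.words[cell][1] = NIL
        return cell

    def give_back(self, cell):
        """Insert a cell at the front of the available space list."""
        self.words[cell] = [None, self.avail]
        self.avail = cell
```

Because cells are always taken from and returned to the front, both operations take constant time regardless of how long the list is.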


[Diagram: S1 → S2 → T]

Fig. 1. A simple list.


Chapter  30  |  A  command  structure  for  complex  information  processing  353 


[Diagram: the main list holds L300 → I4 → L500 → T; L300 holds I1 → L400 → T; L500 holds I5 → I6 → T; L400 ends in T]

Fig. 2. A list structure.


Organization  of  central  unit 

Figure  3  shows  the  special  registers  of  the  machine  and  the  main 
information  transfer  paths.  Four  addressable  registers  accomplish 
fixed  functions.  These  are  shown  as  part  of  the  main  memory,  but 
would  be  fast  access  registers. 

Communication list, L0. The system allows the introduction of
unlimited numbers of processes with variable numbers of inputs
and outputs. The communication of inputs and outputs among
processes is centralized in a communication list with known name,
L0. All subroutines find their inputs on this list, and all subroutines
put their outputs on the same list.

Available space list, L1. All cells not currently being used are on
the available space list; cells can be obtained from it when needed
and are returned to it when they are no longer being used.

List of current instruction addresses (CIA), L2. At any given
moment in working sequentially through a program, there will be
a whole hierarchy of instructions that are in process of
interpretation, but whose interpretation has not been completed. These will
include the instruction currently being interpreted, the routine
to which this instruction belongs, the superroutine to which this
routine belongs, and so on. The CIA list is the list of addresses
of this hierarchy of routines. The first symbol on the list gives the
address of the instruction currently being interpreted; the second
symbol gives the address of the current instruction in the next
higher routine, etc. In this system it proves to be preferable to
keep track of the current instruction being interpreted, rather than
the next one.

List of current CIA lists, L3. The control sequence is complicated
in this computer by the existence of numerous programs which
become active when called upon, and whose processing may be
interspersed among other processes. Hence, a single CIA list does
not suffice; there must be such a list for each program that has
not been completely executed. Therefore, it is necessary also to
have a list that gives the names of the CIA lists that are active.
This list is L3.

Besides these special addressable registers, three
nonaddressable registers are needed to handle the transfers of information.
Two of these, R1 and R2, are each a full word in length, and
transfer information to and from memory. Register R1 receives
input from memory; R2 transmits output to memory. The
comparator that provides the information for all tests takes as its input
for comparison the symbols in R1 and R2. This pair of registers
also performs a secondary function in regenerating words in
memory: the basic "read" operation from memory is assumed to
be destructive; a nondestructive "read" merely shunts the word
received from memory in R1 to R2 and back, by means of a "write"
operation, to the same memory cell.
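
The regeneration scheme can be illustrated with a toy model in which a destructive read clears the cell. The class Core and its method names are hypothetical and stand only for the behavior described above.

```python
class Core:
    def __init__(self, words):
        self.words = dict(words)  # address -> word
        self.r1 = None            # receives input from memory
        self.r2 = None            # transmits output to memory

    def read_destructive(self, address):
        """Basic read: the word arrives in R1 and the cell loses its contents."""
        self.r1 = self.words[address]
        self.words[address] = None

    def write(self, address):
        """Basic write: the word in R2 is stored in the cell."""
        self.words[address] = self.r2

    def read(self, address):
        """Nondestructive read: shunt R1 to R2 and write the word back."""
        self.read_destructive(address)
        self.r2 = self.r1
        self.write(address)
        return self.r1
```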

A register, A, which holds a single address, controls references
to the memory, that is, specifies the memory address at which a
"read" or "write" operation is to be performed. References to the
four addressable registers, L0 to L3, can be made either by A
or directly by the control unit itself; other memory cells can be
referred to only by A. Finally, the computer has a single bit register
which is used to encode and retain test results.


[Diagram labels: Comparator; Communication list; Available space list; Memory]

Fig. 3. Machine information transfer paths.


Part  4     The  instruction-set  processor  level:  special-function  processors 


Section  4  |  Processors  based  on  a  programming  language 


The  environment 

How input-output, secondary storage, and high-speed arithmetic
could be handled with such a machine will be indicated. The
machine manipulates symbols: it can construct complex structures,
search them, and tell when two symbol occurrences are identical.
These processes are sufficient to play chess, prove theorems, or
do most other tasks. The symbols it manipulates are not "coded";
they simply form a set of arbitrary distinguishable entities, like
a large alphabet.

This computer can manipulate things outside itself if hardware
is provided to make some of its symbols refer to outside objects,
and other symbols refer to operations on these objects. It could
do high-speed arithmetic, for example, if some of its symbols were
names of words in memory encoded as numbers as in the usual
computer fashion, and others were names of the arithmetic
operations. In such a scheme these words would not be in the IPL
language; they would have some format of their own, either fixed
or floating-point, binary or decimal. They might occupy the same
physical memory as that used by the control computer. Thus the
IPL language would deal with numbers at one remove, by their
names, in much the same manner as the programmer deals with
numbers in a current computer. A similar approach can be used
for manipulating printers, input devices, etc.

The  word  and  its  interpretation 

All words in IPL have the same format, shown in Fig. 4. The word
a is divided into two major parts: the symbol part, bcde, and the
link, f. It has been observed that the programmer never deals
explicitly with the link, although it will be frequently represented
explicitly to show how manipulations are being accomplished.
Since the same symbol can appear in many words, the symbol
occurrence of the symbol in the word a will be discussed.

A symbol occurrence consists of an operation, b, a designation
operation, c, an address, d, and a responsibility code, e. The
operation, b, takes as operand a single symbol occurrence, which is
called s. The operand, s, is determined by applying the designation
operation, c, to the address, d. Thus, the process determined by
a word is carried out in two stages: the first-stage operation (the
designation operation) determines an operand that becomes the
input to the second-stage operation.

[Diagram: the word at location a, divided into the fields b, c, d, e, and f]

a  Location of word
b  Operation code
c  Designation code
d  Address field
e  Responsibility code
f  Link to next word

Fig. 4. IPL word format.

The  responsibility  bit 

The single bit, e, is an essential piece of auxiliary information. The
address, d, in a symbol may be the address of another list structure.
The responsibility code in a symbol occurrence indicates whether
this occurrence is "responsible" for the structure designated by
d. If the same address, d, occurs in more than one word, only one
of these will indicate responsibility for d.

The main function of the responsibility code is to provide a
way of searching a branching list structure so that every part of
the structure will, sooner or later, be reached, and so that no part
will be reached twice. The need for a definite assignment of
responsibility for the various parts of the structure can be seen
by considering the process of erasing a list. Suppose that a list
has a sublist that appears twice on it, but that does not appear
anywhere else in memory. When the list is erased, the sublist must
be erased if it is not to be lost forever, and the space it occupies
with it. However, after the sublist has been erased when an
occurrence of its name is encountered on the other list, it is imperative
that it not be erased again on the second encounter. Since the
words used by the sublist would have been returned to the
available space list prior to the second encounter, only chaos could
result from erasing it again. The responsibility code would indicate
responsibility, in erasing, for one and only one of the two
occurrences of the name of the sublist.
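
The erasing problem can be sketched as follows, assuming words of the form (symbol, responsibility bit, link) held in a dictionary and assuming the bit is set only on occurrences that name a list. The function erase is a hypothetical illustration, not the IPL-VI routine.

```python
def erase(memory, available, name):
    """Return the cells of list `name` to `available`, erasing a sublist
    only through the occurrence that is responsible for it."""
    addr = name
    while addr is not None:
        symbol, responsible, link = memory.pop(addr)
        available.append(addr)
        if responsible:
            erase(memory, available, symbol)
        addr = link


def example(second_bit):
    """List 10 names sublist 50 twice; only the second bit is varied."""
    return {
        10: (50, True, 11),
        11: (50, second_bit, 12),
        12: ("T", False, None),
        50: ("I1", False, 51),
        51: ("T", False, None),
    }
```

With example(False) every cell is returned exactly once. With example(True) the second occurrence tries to erase cells that have already been returned, and in this sketch the attempt fails with a KeyError.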

Detailed consideration of systems of responsibility is
inappropriate in this paper. It is believed that an adequate system can
be constructed with a single bit, although a system that will handle
merging lists also requires a responsibility bit on the link f. The
responsibility code is essentially automatic. The programmer does
not need to worry about it except in those cases where he is
explicitly seeking to modify structure.

Interpretation  cycle 

A routine is a list of words, that is, a list of instructions. Its name
is the address of the first word used in the list. The interpretation
of a program proceeds according to a very simple cycle. An
instruction is fetched to the control unit. The designation operation is
decoded and executed, placing the location of s in the address
register, A, of Fig. 3. Then operation b is decoded and performed
on s. The cycle is then repeated using f to fetch the next
instruction.
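
The cycle can be written as a short loop. In this sketch a word is a tuple (b, c, d, e, f); the designation step and the table of operations are passed in as parameters, and the stand-ins direct and record are hypothetical placeholders rather than IPL operations.

```python
def run(memory, start, designate, operations):
    """Repeat: fetch a word, designate s, apply b to s, follow the link f."""
    results, a = [], start
    while a is not None:
        b, c, d, e, f = memory[a]                         # fetch the instruction
        address_register = designate(memory, a)           # first stage: locate s
        operations[b](memory, address_register, results)  # second stage
        a = f                                             # link to next word
    return results


def direct(memory, a):
    """Stand-in designation: the address part d is the location of s."""
    return memory[a][2]


def record(memory, address, results):
    """Stand-in operation: note the operand that was designated."""
    results.append(memory[address])


memory = {
    10: (0, 1, 20, 0, 11),
    11: (0, 1, 21, 0, None),
    20: "S1",
    21: "S2",
}
```

Running run(memory, 10, direct, {0: record}) visits the two instructions in link order.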

The  operation  codes 

The simple interpretation cycle previously described provides
none of the powerful linguistic features that were outlined at the
beginning of the paper: hierarchies of subroutines, data programs,
breakouts, etc. These features are obtained through particular b
and c operations that modify the sequence of control. The
operation codes will be explained under the following headings: the
designation code, sequence-controlling operations, save and delete
operations, communication list operations, signal operations, list
operations, and other operations.

The  designation  code 

The designation operation, c, operates on the address, d, to
designate a symbol occurrence, s, that will serve as input, or operand,
for the operation b. The designation operation places the address
of the designated symbol, s, in the address register.

The designation codes proposed, based on their usefulness in
coding with the IPL's, are shown in Appendix 1. The first four,
c = 0, 1, 2, or 3, allow four degrees of directness of reference.
They are usable when the programmer knows in advance where
the symbol, s, is located. To illustrate their definition, consider
an instruction a1 with parts b1, c1, d1, and e1, which can
collectively be called s1. The address part, d1, of this instruction may
be the address of another instruction, d1 = a2; the address part,
d2, of a2 may be the address of a3, etc.

The code c1 = 1 means that s is the symbol whose address is
d1, that is, the symbol s2. In this case the designating operation
puts d1, the address of s2, in the address register. The code c1 = 2
means that s is s3; hence, the operation puts d2, the address of
s3, in the address register. The code c1 = 3 puts d3, the address
of s4, in the address register. Finally, c1 = 0 designates as s the
actual symbol in a1 itself; hence, this means that b is to operate
on s1. Therefore, this operation places a1 in the address register.

The remaining two designation operations, c = 4 and 5,
introduce another kind of flexibility, for they allow the programmer
to delegate the designation of s to other parts of the program.
When c1 = 4, the task of designating s is delegated to the symbol
of the word d1 = a2. In this case, s is found by applying the
designation operation, c2 of word a2, to the address, d2, of word
a2. An operation of this kind permits the programmer to be
unaware of the way in which the data are arranged structurally
in memory. Notice that the operation permits an indefinite number
of stages of delegation, since if c2 = 4, there will be a further
delegation of the designation operation to c3 and d3 in word a3.
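
The designation codes 0 through 4 can be summarized in a short function. This is a sketch based only on the description above: the Word tuple and the helper chain are assumptions made for illustration, and code 5 is deliberately left out.

```python
from collections import namedtuple

Word = namedtuple("Word", "b c d e f")


def designate(memory, a):
    """Return the address of operand s for the word at address a (c = 0..4)."""
    word = memory[a]
    if word.c == 0:            # s is the symbol in the word itself
        return a
    if word.c == 4:            # delegate to the c and d of the word at d
        return designate(memory, word.d)
    if word.c not in (1, 2, 3):
        raise ValueError("designation code not covered by this sketch")
    address = word.d           # c = 1: d1 is the address of s
    for _ in range(word.c - 1):
        address = memory[address].d  # c = 2, 3: follow one or two more d parts
    return address


def chain(c1):
    """Words a1..a4 at addresses 1..4, each pointing at the next (illustrative)."""
    return {
        1: Word(0, c1, 2, 0, None),
        2: Word(0, 1, 3, 0, None),
        3: Word(0, 0, 4, 0, None),
        4: Word(0, 0, 0, 0, None),
    }
```

With c1 = 4 the result depends on the code found in the second word, so the first word need not know how the data are arranged.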

The last designation operation, c = 5, provides both for
delegation and a breakout. With c1 = 5, d1 is interpreted as a process
that determines s. Any program whatsoever, having its initial
instruction at d1, can then be written to specify s. When this
program has been executed, an s will have been designated, and
the interpretation will continue by reverting to the original cycle,
that is, by applying b1 to the s that was just designated. It is
necessary to provide a convention for communicating the result
of process d1 to the interpreter. The convention used is that d1
will leave the location of s in L0, the standard communication cell.

Sequence-controlling  operations 

Appendix 2 lists the 35 b operations. The first 12 of these are the
ones that affect the sequence of control. They accomplish 5 quite
different functions: executing a process (b = 1, 10), executing
variable instructions (b = 2), transferring control within a routine
(b = 3, 4, 5), transferring control among parallel program
structures (b = 0, 6, 7, 8, 9), and, finally, stopping the computer
(b = 11).

A routine is a list of instructions; its name is the address of
the first word in the list. To execute a routine, its name (i.e., its
name becomes the s of the previous section) is designated and to
it is applied the operation b = 1, "execute s." The interpreter
must keep track of the location of the instruction that is being
executed in the current routine and return to that location after
completing the execution of the instruction (which, in general, is
a subroutine). All lists end in a word containing b = 10, which
terminates the list and returns control to the higher routine in
which the subroutine just completed occurred. (The symbol T is
really any symbol with b = 10.)

Figure 5 provides a simple illustration of the relations between
routines and their subroutines. In the course of executing the
routine L10 (i.e., the instructions that constitute list L10), an
instruction, (1, 0, L20), is encountered that is interpreted as "execute
L20." In the course of executing L20, an instruction is encountered
that is interpreted as "execute L30." Assuming that L30 contains
no subroutines, its instructions will be executed in order until the
terminate instruction is reached. Because of the 10 in its b part,
this instruction returns control to the instruction that follows L30
in L20. When the final word in L20 is reached, the operation code
10 in its b part returns control to L10, which then continues with
the instruction following L20. (Only the b part, b = 10, of the
terminal word in a routine is used in the interpretation; the c and
d parts are irrelevant.) This is a standard subroutine linkage, but
with all the sequence control centralized.

Fig. 5. A simple subroutine hierarchy.
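
A minimal sketch of this centralized linkage follows, assuming words reduced to (b, d, f) with the operand taken directly from d. The codes 1 and 10 are those named in the text; the code EMIT is hypothetical and exists only to make the order of execution visible. The stack cia plays the part of the CIA list and holds the current instruction of each routine, not the next one.

```python
EXECUTE, TERMINATE, EMIT = 1, 10, 99  # EMIT is hypothetical, for tracing only


def run(memory, routine):
    cia = [routine]  # top entry: address of the instruction being interpreted
    out = []
    while cia:
        b, d, f = memory[cia[-1]]
        if b == EXECUTE:
            cia.append(d)   # descend; the caller's entry keeps its place
            continue
        if b == TERMINATE:
            cia.pop()       # return to the higher routine
            if not cia:
                break
        else:
            out.append(d)   # EMIT: record the operand
        # advance the current routine past the instruction just completed
        cia[-1] = memory[cia[-1]][2]
    return out


memory = {
    10: (EMIT, "a", 11), 11: (EXECUTE, 20, 12),
    12: (EMIT, "d", 13), 13: (TERMINATE, None, None),
    20: (EMIT, "b", 21), 21: (EXECUTE, 30, 22), 22: (TERMINATE, None, None),
    30: (EMIT, "c", 31), 31: (TERMINATE, None, None),
}
```

Starting at address 10, the routines at 20 and 30 are entered and left in the nested order described for L10, L20, and L30.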

The operation code b = 2, "interpret s," delegates the
interpretation to the word s. The effect of an instruction containing
b = 2 is exactly the same as if the instruction contained, instead,
the symbol, s, that is designated by its c and d parts. One can
think of the instruction with b = 2 as a variable whose value is
s. Thus, a routine can be altered by modifying the symbol
occurrence s, without any modification whatsoever in the words
belonging to the routine itself.

The three operations, b = 3, 4, and 5, are standard transfer
operations. The first is an unconditional transfer; the two others
transfer conditionally on the signal bit. As mentioned earlier, all
binary conditional processes set the signal either "on" or "off."
In order to describe operations b = 0, 6, 7, 8, 9, the concept of
program structure must be defined. A program structure is a
routine together with all its subroutines and designation processes.
Such a structure corresponds to a single, although perhaps
complex, process. The computer is capable of holding, at a given time,
any number of independent program structures, and can interrupt
any one of these processes, from time to time, in order to execute
one of the others. All of these structures are coordinate, or parallel,
and the operations b = 0, 6, 7, 8, 9 are used to transfer control,
perhaps conditionally, from the one that is currently active to a
new one or to the previously active one. In this sense, the
computer being described may be viewed as a serial control, parallel
program machine.

The execution of a particular routine in program structure A
will be used as an example. Operation b = 6 will transfer control
to an independent program structure determined by s; call it B.
The machine will then begin to execute B. When it encounters
a "stop interpretation" operation (b = 0) in B, control will be
returned to the program structure, A, that was previously active.
But the "stop interpretation" operation, unlike the ordinary
termination, b = 10, does not mark the end of program structure B.
At any later point in the execution of A, control may again be
transferred to B, in which case execution of the latter program
will be resumed from the point where it was interrupted by the
earlier "stop interpretation" command. The operation that
accomplishes the second transfer of control from A to B is b = 7,
"continue parallel program s." Thus, b = 0 is really an "interrupt"
operation, which returns control to the previous structure, but
leaves the structure it interrupts in condition to continue at a later
point. There can be large numbers of independent program
structures all "open for business" at once, with a single control passing
from one to the other, determining which has access to the
processing facilities, and gradually executing all of them. Operations
b = 8 and 9 simply allow the interruption to be conditional on
the test switch.

Notice that the passage of control from one structure to another
is entirely decentralized; it depends upon the occurrence of the
appropriate b operations in the program structure that has control.

When control is transferred to a parallel program structure,
either of two outcomes is possible. Either a "stop interpretation"
instruction is reached in the structure to which control has been
transferred, or execution of that structure is completed and a
termination reached. In either case, control is returned to the
program structure that had it previously, together with
information as to whether it was returned by interruption or by
termination. Thus, b = 0 turns the signal bit on when it returns control;
b = 10 in the topmost routine of a structure turns the signal off.
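
The behavior of parallel program structures resembles that of coroutines. The sketch below uses a Python generator as an analogy only: each yield plays the part of "stop interpretation" (b = 0), running off the end plays the part of the final b = 10, and the hypothetical function transfer returns the signal together with whatever the structure handed back.

```python
def structure_b():
    yield "B: first part"   # plays the role of b = 0, "stop interpretation"
    yield "B: second part"
    # running off the end plays the role of the final b = 10


def transfer(structure):
    """Give control to `structure`; report how control came back."""
    try:
        return True, next(structure)  # returned by interruption: signal on
    except StopIteration:
        return False, None            # returned by termination: signal off


b = structure_b()
first = transfer(b)    # begin B (compare b = 6)
second = transfer(b)   # continue B where it stopped (compare b = 7)
third = transfer(b)    # B runs to its end
```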

The  operation,  b  =  11,  simply  halts.  Processing  continues  from 
the  location  where  it  halted  upon  receipt  of  an  external  signal, 
"go." 

Save  and  delete  operations 

The two operations, b = 12 and 13, are sufficiently fundamental
to warrant extended treatment. For example, consider a word:

Location    Symbol    Link
L100        I1        t

The link of L100, t, indicates that the next word holds the
termination operation, b = 10. The "save" operation (b = 12)
provides a copy of I1 in such a way that I1 can later be recalled,
even if in the meantime the symbol in L100 has been changed.
After the "save" operation has been performed on s = L100, the
result is:

Location    Symbol    Link
L100        I1        L200
L200        I1        t

A new cell, which happened to be L200, was obtained during
the "save" operation from the available space list, L1, and a copy
of I1 was put in it. The symbol in L100 can now be changed without
losing I1 irretrievably. Suppose a different symbol is copied, for
example, I2, into L100. Then:


Location    Symbol    Link
L100        I2        L200
L200        I1        t

Although I1 has been replaced in L100, I1 can be recovered by
performing the "delete" operation, b = 13. Before the "delete"
operation is explained, it will be instructive to show what happens
when the "save" operation on L100 is iterated. If it is executed
again, it will make a copy of I2. Therefore:

Location    Symbol    Link
L100        I2        L300
L300        I2        L200
L200        I1        t


Notice that the cell L200, in which the copy of symbol I1 is
retained, was not affected at all by this second "save" operation.
Only the top cell in the list and the new cell from the available
space list are involved in the transaction of saving. The same
process is performed no matter how long the list that trails out
below L100; thus, the save operation can be applied as many times
as desired with constant processing time.

The "delete" operation, b = 13, applied to the symbol I2 in
L100, will now be illustrated. This operation puts the symbol and
link of the second word in the list, L300, into the first cell, L100,
and puts L300 back on the available space list, with the following
result:

Location    Symbol    Link
L100        I2        L200
L200        I1        t

The result is the exact situation obtained before the last "save"
was performed.
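
The sequence of tables above can be reproduced with two short functions. This sketch covers only the push-down behavior of "save" and "delete"; the erasing of structures for which a symbol is responsible is omitted, and the function names and the Python list standing in for the available space list are assumptions.

```python
def save(memory, available, top):
    """b = 12: copy the top word into a new cell, then link the top cell to it."""
    new = available.pop()
    symbol, _ = memory[top]
    memory[new] = memory[top]
    memory[top] = (symbol, new)


def delete(memory, available, top):
    """b = 13, push-down part only: move the second word up into the top cell."""
    _, second = memory[top]
    memory[top] = memory.pop(second)
    available.append(second)


memory = {100: ("I1", "t")}
available = [300, 200]                # cell 200 will be handed out first

save(memory, available, 100)
after_first_save = dict(memory)
memory[100] = ("I2", memory[100][1])  # copy a different symbol into L100
save(memory, available, 100)
after_second_save = dict(memory)
delete(memory, available, 100)
```

Neither function looks beyond the top cell and one other cell, which is why the cost does not depend on how long the list below L100 has grown.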

In the description of the "delete" operation up to this point,
only the changes it makes in the "push-down" list, in this case
L100, have been considered. The operation does more than this,
however; "delete s" also erases all structures for which the symbol
s (I1 and I2 in the examples) is responsible. When a copy of a
symbol is made, e.g., the operation that initially replaced I1 by
I2 in L100, the copy is not assigned responsibility for the symbol
(e = 0 was set in the copy). Thus, no additional erasing would
be required in the particular "delete" operation illustrated. If, on
the other hand, the I2 that was moved into L100 had been
responsible for the structure that could be reached through it (if it
were the name of a list, for example), then a second "delete"
operation, putting I1 back into L100, would also erase that list and
put all its cells back on the available space list. Thus "delete" is
also equivalent to "erase" a list structure.
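
The "save" and "delete" operations can be illustrated with a small Python sketch. It is not the IPL hardware design: the names Cell and Store are hypothetical, symbols are plain strings, and it is assumed that the copy left on top by "save" carries e = 0 while the pushed-down word keeps the original responsibility bit.

```python
class Cell:
    """One word: a symbol, its responsibility bit e, and a link."""
    def __init__(self, symbol=None, resp=False, link=None):
        self.symbol, self.resp, self.link = symbol, resp, link


class Store:
    """Hypothetical memory model with an available space list."""
    def __init__(self, n_cells):
        self.free = None
        for _ in range(n_cells):          # chain all spare cells together
            self.free = Cell(link=self.free)

    def free_count(self):
        n, cell = 0, self.free
        while cell is not None:
            n, cell = n + 1, cell.link
        return n

    def take(self):
        cell = self.free
        if cell is None:
            raise MemoryError("available space list is empty")
        self.free, cell.link = cell.link, None
        return cell

    def give(self, cell):
        cell.symbol, cell.resp, cell.link = None, False, self.free
        self.free = cell

    def save(self, top):
        """b = 12: only the top cell and one new cell are touched."""
        new = self.take()
        new.symbol, new.resp, new.link = top.symbol, top.resp, top.link
        top.resp = False                  # assumption: the copy on top has e = 0
        top.link = new

    def erase(self, head):
        """Return a whole list structure to available space."""
        cell = head
        while cell is not None:
            nxt = cell.link
            if cell.resp and isinstance(cell.symbol, Cell):
                self.erase(cell.symbol)
            self.give(cell)
            cell = nxt

    def delete(self, top):
        """b = 13: erase what the top symbol owns, then pop the list."""
        if top.resp and isinstance(top.symbol, Cell):
            self.erase(top.symbol)
        second = top.link
        if second is None:                # nothing below: the cell becomes empty
            top.symbol, top.resp = None, False
            return
        top.symbol, top.resp, top.link = second.symbol, second.resp, second.link
        self.give(second)


store = Store(4)
L100 = Cell("I1")
store.save(L100)        # L100: I1 -> I1
L100.symbol = "I2"      # replace the top symbol by I2
store.save(L100)        # L100: I2 -> I2 -> I1
store.delete(L100)      # back to L100: I2 -> I1
```

Neither operation walks the list, which is the sense in which the processing time is constant; only "erase" has to visit every cell of the structure being returned.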

Communication  list  operations 

In describing a process as a list of subprocesses, the question of
inputs and outputs from the processes has been entirely by-passed.
Since each subroutine has an arbitrary and variable number of
operands as input, and provides to the routine that uses it an
arbitrary number of outputs, some scheme of communication is
required among routines. The communication list, L0, accomplishes
this function in IPL.

The inputs and outputs of a routine are required to be symbols.
This is no real restriction since a symbol can be the name of any
list structure whatever. Each routine will take as its inputs the
first symbols in the L0 list. That is, if a routine has three inputs,
then the first three symbols in L0 are its inputs. Each routine must
remove its inputs from L0 before terminating with b = 10, so
as to permit the use of the communication list by subsequent
routines. Finally, each routine leaves its outputs at the head of
list L0.

The b operations 14 through 19 are used for communication
in and out of L0. Their one common feature is that, whenever they
put a symbol in L0, they save the symbol already there, that is,
they push down the symbols already "stacked" in L0. Likewise,
whenever a symbol is moved from L0 to memory, the symbol below
it in L0 "pops up" to become the top one. (To be precise, the
responsibility bit travels with a symbol when it is moved. Hence,
for example, b = 16 and 17 do not, unlike the "delete" operation,
erase the structure for which 1L0 is responsible.)

The four operations, b = 14, 15, 16, and 17, are the main in-out
operations for L0. Two options are provided, depending on whether
the programmer wishes to retain the s in memory (b = 14 and
16) or destroy it (b = 15 and 17). (The move in operation 15 has
the same significance as in 16 and 17; the responsibility bit moves
with the symbol, and the symbol previously in the location of s
is recalled.)
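
The four in-out operations can be sketched as follows. This is an illustration under stated assumptions rather than the machine's definition: every location, including L0, is modelled as a Python list whose last element is the top symbol, the class Sym and the function names are hypothetical, and a copy is assumed to carry e = 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    name: str
    resp: bool = False          # responsibility bit e


def copy_in(L0, loc):
    """b = 14: copy s into L0, saving 1L0; s stays where it was."""
    L0.append(Sym(loc[-1].name, False))   # assumption: a copy has e = 0

def move_in(L0, loc):
    """b = 15: move s into L0; the symbol below s pops up in its location."""
    L0.append(loc.pop())

def move_out_saving(L0, loc):
    """b = 16: move 1L0 into the location of s, pushing s down."""
    loc.append(L0.pop())

def move_out_destroying(L0, loc):
    """b = 17: move 1L0 over s (erasing what s owned is not modelled)."""
    loc[-1] = L0.pop()


L0 = []
A = [Sym("A1", resp=True)]
W = [Sym("old")]
copy_in(L0, A)               # L0: A1 (e = 0); A unchanged
move_in(L0, A)               # L0: A1 (e = 0), A1 (e = 1); A is now empty
move_out_saving(L0, W)       # W: old, A1 (e = 1)
move_out_destroying(L0, W)   # W: old, A1 (e = 0); L0 is empty again
```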

Operation b = 18 is a special input to aid in the breakout
designation operation, c = 5. Recall that the latter operation
requires d to place the location of s, the symbol it determines, in
L0. Operation 18 allows the process d to accomplish this.

Operation b = 19 provides the means for creating structures.
It takes a cell, for example, L100, from available space, and puts
its name, as the symbol (0, 0, L100), in the location of the designated
symbol, s. The symbol s, previously in this location, is pushed down
and saved.

Signal  operations 

Ten b operations are primarily involved in setting and manipulating
the signal bit. Observe that the test of equality (b = 20 and
21) is identity of symbols. Since there is nothing in the system
that provides a natural ordering of symbols, inequality tests like
s > 1L0 are impossible. (1L0 means the symbol in L0.) It is
necessary to be able to detect the responsibility bit (b = 22), since
there are occasions when the explicit structure of lists is important,
and not just the information they designate. Finally, although the
signal bit is just a single switch, it is necessary to have two symbols,
one corresponding to "signal on" and the other to "signal off" (b = 26
and 27), so that the information in the signal can be retained for
later use (b = 28 and 29).

The sense of the signal is not arbitrary. In general "off" is used
to mean that a process "failed," "did not find," or the like. Thus,
in operations b = 6 and 7, the failure to find a "stop interpretation"
operation sets the signal to "off." Likewise, the end of a list
will be symbolized by setting the signal to "off."

List  operations 

Both the "save" and "delete" operations are used to manipulate
lists, but besides these, several others are needed. The three
operations, b = 30, 31, 32, allow for search over list structures. They
can be paraphrased as: "get the referent," "turn down the sublist,"
and "get the next word of the list." They all have in common that
they replace a known symbol with an unknown symbol. This
unknown symbol need not exist; that is, the symbol referred to
may contain a b = 10 operation, which means that the end of the
list has been reached. Consequently, the signal is always set "on"
if the symbol is found, and "off" if the symbol is not found. One
of the virtues of the common signal is apparent at this point, since,
if the programmer knows that the symbol exists, he will simply
ignore the signal. Instruction formats that provide for additional
addresses for conditional transfers would force the programmer
to attend to the condition even if it only meant leaving a blank
space in the program.

To illustrate how these search operations work, Fig. 6 shows
a list of lists, L300, and a known cell, L100. Cell L100 contains the
reference to the list structure. The programmer does not know
how the list, L300, is referenced. He wants to find the last symbol
on the last list of the structure. His first step is (30, 1, L100), which
replaces the reference by the name of the list, L300. He then
searches down to the end of list L300 by doing a series of
operations: (32, 1, L100). Each of these replaces one location on the list
by the next one. In fact, a loop is required, since the length of
the list is unknown. Hence, after each "find the next word"
operation, he must transfer, on the basis of the signal, back to the same
operation if the end of the list hasn't been reached. The net result,
when the end of the list is reached, is that the location of the
last word on list L300 rests in L100. Since in this example he wants
to go down to the end of the sublist of the last word on the main
list, he next performs (31, 1, L100). This operation replaces the
location of the last word with the name of the last list, L700. Now
the search down the sublist is repeated until the end is again
reached; at this point the location of the last symbol on the last
list is in L100, as desired. The sequence of code follows:


Location   Symbol         Link
           b   c  d

           30, 1, L100
L888       32, 1, L100
            4, 0, L888
           31, 1, L100
L999       32, 1, L100
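
The search can be traced with a short Python sketch. It is a simplified model, not the machine: memory is a dictionary from addresses to words, the b part of stored symbols is omitted, operation 30 is sketched only for the c = 1 case, the returned boolean stands for the signal, and the addresses L301, L302, L701 and the data symbols P and Q are made up for the illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Word:
    sym: Tuple[int, str]            # (c, d) parts of the symbol
    link: Optional[str] = None      # f part: next word, None at the end


def op30(mem, a):
    """Get the referent (only the c = 1 case is sketched)."""
    c, d = mem[a].sym
    if c != 1 or d not in mem:
        return False                # signal off, s left alone
    mem[a].sym = mem[d].sym
    return True                     # signal on

def op31(mem, a):
    """Turn down the sublist: replace s by the symbol in d of s."""
    _, d = mem[a].sym
    if d not in mem:
        return False
    mem[a].sym = mem[d].sym
    return True

def op32(mem, a):
    """Get the next word: replace s by (4, f part of d of s)."""
    _, d = mem[a].sym
    if d not in mem or mem[d].link is None:
        return False
    mem[a].sym = (4, mem[d].link)
    return True


mem = {
    "L100": Word((1, "L200")),          # known cell: a reference
    "L200": Word((0, "L300")),          # holds the name of the list of lists
    "L300": Word((0, "L400"), "L301"),  # main list, three words
    "L301": Word((0, "L500"), "L302"),
    "L302": Word((0, "L700")),
    "L700": Word((0, "P"), "L701"),     # last sublist, two words
    "L701": Word((0, "Q")),
}

op30(mem, "L100")
while op32(mem, "L100"):    # the loop closed by the transfer on signal
    pass
op31(mem, "L100")
while op32(mem, "L100"):
    pass
last_location = mem["L100"].sym[1]
```

Each `while` loop corresponds to a "get the next word" instruction followed by a transfer back to it while the signal is on.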


Fig. 6. Example of finding last item of last sublist.

The operations, b = 33 and 34, allow for inserting symbols in
a list either before or after the symbol designated. The lists in
this system are one-way: although there is always a way of finding
the symbol that follows a designated symbol, there is no way of
finding the symbol that precedes a designated symbol. The "insert
before" operation does not violate this rule. In both operations,
33 and 34, a cell is obtained from the available space list and
inserted after the word holding the designated symbol. (This is
identical with the first step of the "save" operation.) In the "insert
before" operation (b = 33) the designated symbol, s, is copied into
the new cell, and 1L0 is moved into the previous location of s.
In "insert after" (b = 34), the designated symbol is left unchanged,
and 1L0 is moved into the new cell. In both cases 1L0 is moved,
that is, it no longer remains at the head of the communication
list.
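
How "insert before" works on a one-way list without ever finding a predecessor can be seen in a minimal Python sketch. The names are hypothetical, the Cell constructor stands in for taking a cell from available space, and L0 is modelled as a Python list whose last element is the top.

```python
class Cell:
    def __init__(self, symbol, link=None):
        self.symbol, self.link = symbol, link


def insert_before(cell, L0):
    """b = 33: new cell goes after the designated word and receives s."""
    new = Cell(cell.symbol, cell.link)
    cell.link = new
    cell.symbol = L0.pop()          # 1L0 moves into the old location of s

def insert_after(cell, L0):
    """b = 34: s is unchanged; 1L0 moves into the new cell."""
    cell.link = Cell(L0.pop(), cell.link)

def symbols(head):
    out = []
    while head is not None:
        out.append(head.symbol)
        head = head.link
    return out


c = Cell("C")
b = Cell("B", c)
a = Cell("A", b)
L0 = ["Y", "X"]                     # X is on top of the communication list
insert_before(b, L0)                # X ends up in front of B
insert_after(c, L0)                 # Y ends up behind C
result = symbols(a)
```

Both operations touch only the designated word and the new cell, which is why the one-way rule is not violated.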

Other  operations 

This completes the account of the basic complement of operations
for the IPL computer. These form a sufficient set of operations
to handle a wide range of nonnumerical problems. To do arithmetic
efficiently, one would either add another set of b's covering
the standard arithmetic operations or deal with these operations
externally via a breakout operation on b (not formally defined here)
that would move a full symbol into a special register for hardware
interpretation relative to external machines: adders, printers,
tapes, etc.

The set of operations has not been described for reading and
writing the various parts of the word: b, c, d, e, and f (although
it may be possible to automatize this last completely). These
operations rarely occur, and it seemed best to ignore them as well
as the input-output operations in the interest of simple presentation.

Interpretation 

This section will describe in general terms the machine
interpretation required to carry out the operation codes prescribed. There
is not enough space to be exhaustive; therefore selected examples
will be discussed.


Direct  designation  operations 

Figure 7 shows the information flows for c = 2, an operation that
is typical of the first four designation operations. These flows follow
a simple, fixed interpretation sequence. Assume that instruction
(-, 2, L100) is inside the control unit. The contents of L100 are
brought into R1, the input register, then transferred to R2, the
output register, and back to L100 again. The d part of R2 now
contains the location of s, and this location is transferred from
R2 to the address register.

Execute subroutine (b = 1)

When "execute s" is to be interpreted, the address register already
contains the location of s, which was brought in during the first
stage of the interpretation cycle. L2, the current instruction
address list (CIA), holds the address of the instruction containing
the "execute" order. A "save" operation is performed on L2, and
s is transferred into L2, which ends the operation. The result is
to have the interpreter interpret the first instruction on the next
sublist, and to proceed down it in the usual fashion. Upon reaching
the terminate operation, b = 10, the delete operation is performed
on 1L2, thus bringing back the original instruction address from
which the subroutine was executed. Now, when the interpretation
cycle is resumed, it will proceed down the original list. Thus, the
two operations, save and delete, perform the basic work in keeping
track of subroutine linkage.
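
The linkage just described can be mimicked in a few lines of Python. The sketch is only an illustration: it assumes routines are stored as lists of (b, s) pairs, uses a Python list as the push-down list of instruction addresses, and introduces a made-up operation name "emit" to stand for any operation that is not sequence control.

```python
def run(routines, start):
    """Interpret routines using save/delete on a push-down CIA list."""
    trace = []
    cia = [(start, 0)]                  # top of the list is the last element
    while cia:
        name, i = cia[-1]
        b, s = routines[name][i]
        if b == 1:                      # execute s: save CIA, put s on top
            cia.append((s, 0))
            continue
        if b == 10:                     # terminate: delete the top CIA
            cia.pop()
            if not cia:                 # no CIA pops up: the program is done
                break
        else:
            trace.append(s)             # hypothetical stand-in operation
        name, i = cia[-1]
        cia[-1] = (name, i + 1)         # step to the next instruction
    return trace


routines = {
    "main": [("emit", "a"), (1, "sub"), ("emit", "d"), (10, None)],
    "sub": [("emit", "b"), ("emit", "c"), (10, None)],
}
order = run(routines, "main")
```

Because the caller's address is simply pushed down, subroutines may nest to any depth without further bookkeeping.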

Parallel  programs 

A single program structure, that is, a routine with all its
subroutines, and their subroutines, etc., requires a CIA list in order
to keep track of the sequence of control. In order to have a number
of independent program structures, a CIA list is required for each.
L3 is the fixed register which holds the name of the current CIA
list. The name of the CIA list for the program structure which
is to be reactivated on completion or interruption of the current
program structure is the second item on the L3 list, etc. Therefore,
the L3 list is appropriately called the current CIA list. The "save"
and "delete" operations are used to manipulate L3 analogously
to their use with L2 previously described.

Fig. 7. Information transfers in c = 2 operation.

Appendix  3  gives  a  more  complete  schematic  representation 
of  the  interpretation  cycle.  It  has  still  been  necessary  to  represent 
only  selected  b  operations. 

Data  programs 

In the section on list operations a search of a list was described.
There the data were passive; the processing program dictated just
what steps were taken in covering the list. Consider a similar
situation, shown in Fig. 8, where there is a working cell, L100,
which contains the name of a list, L300. L300 is a data program.
There is a program that wants to process the data of L300, which
is a sequence of symbols. This program knows L100. To obtain the
first symbol of data, it does (6, 1, L100), that is, "execute the parallel
program whose name is in L100." The result is to create a CIA
list, L500, put its name in L100, and fire the program. Some sort
of processing will occur, as indicated by the blank words of L300.
Presumably this has something to do with determining what the
data are, although it might be some bookkeeping on L300's
experience as a data file. Eventually L700 is reached, which contains
(0, 1, L800). This operation stops the interpretation, and returns
control to the original processing program. The first symbol of data
is defined to be 1L800. The processing program can designate this
by 4L100, since the sequence of c = 4 prefixes in L100 and L500
passes along the interpretation until it ultimately becomes 1L800.
Now the processing program can proceed with the data. It remains
completely oblivious to the processing and structure that were
involved in determining what was the first symbol of data.
Similarly, although it is not shown, the processing program is able to
get the second symbol of data at any time simply by doing a
"continue parallel program 1L100" (b = 7).

Before 6, 1, L100:

L100 | 0, 0, L300
L300 | ...

After 6, 1, L100:

L100 | 0, 4, L500
L500 | 0, 4, L700   (CIA)
L700 | 0, 1, L800

Fig. 8. Example of a data program.

One  virtue  of  the  use  of  data  programs  is  the  solution  it  offers 
for  "interpolated"  lists.  In  working  on  a  chess  program,  for  example, 
one  has  various  lists  of  men:  pawns,  pieces,  pieces  that  can  move 
more  than  one  square,  such  as  rooks,  queens,  etc.  One  would  like 
a  list  of  all  men.  There  already  exists  a  list  of  all  pieces  and  a 
list  of  all  pawns.  It  would  be  desirable  to  compose  these  lists  into 
a  single  long  list  without  losing  the  identity  of  either  of  the  short 
lists,  since  they  are  still  used  separately.  In  other  words  form  a 
list  whose  elements  are  the  two  lists,  but  such  that,  when  this  list 
of  lists  is  searched  it  looks  like  a  single  long  list.  Further,  and  this 
is  the  necessary  condition  for  doing  this  successfully,  one  cannot 
afford  to  make  the  program  that  uses  this  list  of  lists  know  the 
structure.  The  operation  "execute  s"  (b  =  1)  is  precisely  the  opera- 
tion needed  to  accomplish  this  task  in  a  data  program.  It  says  "turn 
aside  and  go  down  the  sublist  s."  Since  it  does  not  have  the  opera- 
tion b  =  0,  it  is  not  "data."  It  is  simply  "punctuation"  that 
describes  the  structure  of  the  data  list,  and  allows  the  appropriate 
symbols  to  be  designated.  Figure  9  shows  a  data  list  of  the  kind 
just  described.  The  authors  have  taken  the  liberty  of  writing  in 
the  names  of  the  chessmen. 
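
The effect of this "punctuation" can be sketched in Python, with a generator playing the part of the data program. This is an analogy under stated assumptions, not the IPL mechanism itself: each list is written as a sequence of (b, d) pairs, the addresses L100, L200 and L300 are illustrative, and `next` on the generator stands in for the "execute" and "continue parallel program" operations.

```python
def symbols(lists, name):
    """Yield the data symbols of a list, descending through b = 1 words."""
    for b, d in lists[name]:
        if b == 0:                 # a data word: d is the symbol
            yield d
        elif b == 1:               # punctuation: go down the sublist d
            yield from symbols(lists, d)


lists = {
    "L100": [(1, "L200"), (1, "L300")],                    # all men
    "L200": [(0, "King"), (0, "Queen"), (0, "K. Rook")],   # pieces
    "L300": [(0, "Pawn"), (0, "Pawn"), (0, "Pawn")],       # pawns
}

men = symbols(lists, "L100")
first = next(men)          # plays the role of "execute parallel program"
second = next(men)         # plays the role of "continue parallel program"
rest = list(men)
pieces_only = list(symbols(lists, "L200"))
```

The program consuming `men` never learns that two lists were involved, while the short lists remain usable on their own.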

The stretch of code that follows shows the use of a data program
for a "table look up" operation. The table has arbitrary arguments,
each of which has a symbol for its value. A1, A2, etc. have been
used to represent the arguments. To find the value corresponding
to argument A5, for example, A5 is put in the communication cell
with (14, 0, A5). Then the data program is executed with (6, 0,
L100). Control now lies with the table, which tests each argument
against the symbol in the communication list, i.e., A5, and sets
the signal accordingly. The program stops interpreting (b = 8) at
the word holding the value only if the arguments are the same.
In this case it would stop, designating L350. If no entry was found,
of course, control would return to the inquiring program with the
signal off.

Location   Symbol         Link

L100       20, 0, A1
            8, 0, L300
           20, 0, A2
            8, 0, L320
           20, 0, A5
            8, 0, L350    (end)
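
The behaviour of such a table can be sketched in Python. The function below is a hypothetical stand-in for the interpreter, handling only the two operations the table uses (b = 20 and b = 8) on words written as (b, d) pairs; the value addresses are illustrative.

```python
def look_up(table, top_of_L0):
    """Interpret a table; return (signal, designated symbol)."""
    signal = False
    for b, d in table:
        if b == 20:                    # signal on if d equals 1L0, off if not
            signal = (d == top_of_L0)
        elif b == 8 and signal:        # stop interpreting if the signal is on
            return True, d
    return False, None                 # ran off the end: signal off


table = [
    (20, "A1"), (8, "L300"),
    (20, "A2"), (8, "L320"),
    (20, "A5"), (8, "L350"),
]
found = look_up(table, "A5")
missing = look_up(table, "A9")
```

The inquiring program sees only the signal and the designated symbol, so the table could be reorganized without changing the code that uses it.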


L100 | 1, 0, L200   ->   | 1, 0, L300

L200 | 0, 0, King            L300 | 0, 0, Pawn
     | 0, 0, Queen                | 0, 0, Pawn
     | 0, 0, K. Rook              | 0, 0, Pawn

Fig. 9. Application of a data program to chess.

Conclusions

The purpose of this paper has been to outline a command structure
for complex information processing, following some of the concepts
used in a series of interpretive languages, called IPL's. The ultimate
test of a command structure is the complex problems it
allows one to solve that would not have been solved if the coding
language were not available.

At least two different factors operate to keep problems from
being solved on computers: the difficulty of specification, and the
effort required to do the processing. The primary features of this
command structure have been aimed at the specification problem.
The authors have tried to specify the language requirements for
complex coding, and then see what hardware organization allowed
their mechanization. All the features of delegation, indirect
referencing, and breakout imply a good deal of interpretation for each
machine instruction. Similarly, the parallel program structure
requires additional processing to set up CIA lists, and when a data
symbol is designated, there is delegated interpreting through
several words, each of which exacts its toll of machine time. If
one were solely concerned with machine efficiency, one would
require the programmer to so plan and arrange his program that
direct and uniform processes would suffice. Considering the size
of current computers and their continued rate of growth toward
megaword memories and microsecond operations, it is believed
that the limitation already lies with the programmer with his
limited capacity to conceive and plan complicated programs. The
authors certainly know this to be true of their own efforts to
program theorem proving programs and chess playing programs,
where the IPL languages or their equivalent in flexibility and also
in power have been a necessary tool.

Considering the amount of interpretation, and the fact that
interpretation uses the same operations as are available to the
programmer, e.g., the save and delete operations, one can think
of alternative ways to realize an IPL computer. At one extreme
are interpretive routines on current computers, the method that
the authors have been using. This is costless in hardware, but
expensive in computing time. One could also add special
operations to a standard repertoire to facilitate an interpretive version
of the language. Probably much more fruitful is the addition of
a small amount of very fast storage to speed up the interpreter.
Finally, one could wire in the programs for the operations to get
even more speed. It is not clear that there is any arrangement more
direct than the wired in program because of the need of the
interpreter to use the whole capability of its own operation code.

APPENDIX 1   c OPERATIONS (DESIGNATING OPERATIONS)

c    Nature of operation for (a) = b c d e.

0    d is the symbol s.
1    d is the address of the symbol s.
2    d is the address of the address of the symbol s.
3    d is the address of the address of the address of the symbol s.
4    d is the address of the designating instruction that
     determines s.
5    d is the address (name) of a process that determines s.

APPENDIX 2   b OPERATIONS

b    Nature of operation

Sequence-Control Operations

0    Stop interpreting; return to previous program structure.
1    Execute process named s.
2    Interpret instruction s.
3    Transfer control to location s.
4    Transfer control to location s, if signal is on.
5    Transfer control to location s, if signal is off.
6    Execute parallel program s; turn signal on if stops; off if not.
7    Continue parallel program s; turn signal on if stops; off if not.
8    Stop interpreting, if signal is on.
9    Stop interpreting, if signal is off.
10   Terminate.
11   Halt: proceed on go.

Save and Delete Operations

12   Save s.
13   Delete s (and everything for which s is responsible).




Communication List Operations

14   Copy s into communication list, saving 1L0.
15   Move s into communication list, saving 1L0.
16   Move 1L0 into location of s, saving s.
17   Move 1L0 into location of s, destroying s.
18   Copy location of s into communication list, saving 1L0.
19   Create a new symbol in location of s, saving s.

Signalling Operations

20   Turn signal on if s = 1L0, off if not.
21   Turn signal on if s = 1L0, off if not; delete 1L0.
22   Turn signal on if s is responsible, off if not.
23   Turn signal on.
24   Turn signal off.
25   Invert signal.
26   Copy signal into location of s.
27   Copy signal into location of s, saving s.
28   Set signal according to s.
29   Set signal according to s; delete s.

List Operations

30   Replace s by the symbol designated by s, and turn signal on;
     if symbol doesn't exist (b = 10), leave s and turn signal off.
31   Replace s by the symbol in d of s and turn signal on; if symbol
     doesn't exist, leave s and turn signal off.
32   Replace s by the location of the next symbol after d of s and
     turn signal on (s replaced by "0, 4, (f part of d of s)");
     if next symbol does not exist, leave s and turn signal off.
33   Insert 1L0 before s (move symbol from communication list).
34   Insert 1L0 after s (move symbol from communication list).


APPENDIX  3    THE  INTERPRETATION  CYCLE 

1. Fetch the current instruction according to the current instruction
   address (CIA) of the current CIA list.

2. Decode and execute the c operation:

   If c = 3 replace d by the d part of the word at address d, reduce
   c to c = 2 and continue. If c = 2 replace d by the d part of the
   word at address d, reduce c to c = 1 and continue. If c = 1
   put d in the address register and go to step 3.

   If c = 0 put CIA in the address register and go to step 3.

   If c = 4 replace c, d by the c, d parts of the word at address
   d and go to step 2.

   If c = 5 mark CIA "incomplete," save it, set a new CIA = d,
   and go to step 1.

3. Decode and execute the b operation: (Some of the b operations
   which affect the interpretation cycle follow.)

   If b = 0 turn the signal on, delete CIA and go to step 4.

   If b = 1 save CIA, set a new CIA = d part of s and go to
   step 1.

   If b = 2 replace b, c, d by s and go to step 2.

   If b = 3 replace CIA by the d part of s and go to step 1.

   If b = 10 delete CIA.
      If no CIA "pops up" turn signal off, delete CIA and go to
      step 4.
      If "popped up" CIA is marked "incomplete" fetch the current
      instruction again, move 1L0 into address register and
      go to step 3.
      Otherwise go to step 4.

4. Replace CIA by the f part of the current instruction and go
   to step 1.
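
Step 2 of the cycle can be sketched as a small Python function. It is a simplified illustration, not the machine's control logic: memory is a hypothetical dictionary mapping an address to the (c, d) parts of the word stored there, the function returns the value that would be placed in the address register, and the breakout case c = 5 is left out because it needs the whole cycle.

```python
def locate(mem, c, d, cia):
    """Return the address of s for the designation (c, d)."""
    while True:
        if c == 0:
            return cia                 # d itself is s, held in the instruction
        if c == 1:
            return d                   # d is the address of s
        if c in (2, 3):
            d = mem[d][1]              # one level of indirection
            c -= 1
        elif c == 4:
            c, d = mem[d]              # delegate to the designation at d
        else:
            raise NotImplementedError("c = 5 (breakout) is not sketched")


# The data program example: 4L100 is passed along until it becomes 1L800.
mem = {
    "L100": (4, "L500"),
    "L500": (4, "L700"),
    "L700": (1, "L800"),
}
data_symbol_at = locate(mem, 4, "L100", cia="L10")
indirect = locate({"L100": (0, "L200")}, 2, "L100", cia="L10")
```

The loop makes the cost of delegation visible: every c = 4 or indirect step is one more memory reference before the b operation can begin.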


Chapter  31 

System design of a FORTRAN machine¹


Theodore  R.  Bashkow  /  Azra  Sasson  /  Arnold  Kronfeld 

Summary  A system design is given for a computer capable of direct
execution of FORTRAN language source statements. The allowed types
of statements are the FORTRAN DO, GO TO, computed GO TO,
Arithmetic, READ, PRINT, arithmetic IF, CONTINUE, PAUSE, DIMENSION
and END statements. Up to two subscripts are allowed for variables and
no FORMAT statement is needed. The programmer's source program is
converted to a slightly modified form while being loaded and placed in a
Program Area in lower memory. His original variable names and statement
numbers are retained in a Symbol Table in upper memory, which also serves
as the data storage area. During execution of the program each FORTRAN
statement is read and interpreted at basic circuit speeds since the machine
is a hardware interpreter for these statements. The machine corresponds
therefore to a "one-pass, load-and-go" compiler except, of course, that there
is no translation to a different machine language. It is estimated that the
control circuitry for this machine will require on the order of 10,000 diodes
and 100 flip-flops. This does not include arithmetic circuitry.

Index Terms  Digital computer system, digital machine design, direct
execution of FORTRAN, FORTRAN computer system, FORTRAN language
machine, hardware interpreter.

Introduction 

The algebraic languages, in particular FORTRAN in this country,
have had enormous impact on the utilization of computers for
scientific and engineering computation. They were designed in
large part to overcome the annoyance of lengthy learning time
and the laborious attention to detail needed to use a basic machine
language.

These annoyances are overcome by providing a language which
is closer to English in form, and freer of "bookkeeping" details,
than the usual machine languages, and by providing a machine
language program, called a compiler or translator, to convert from
the source program written by a user to an object program
executable by a computer. Thus the original drawbacks are overcome
but the discrepancy between the external language of the user
and the internal language of the machine leads to at least two
others. The compilation run of the machine, during which the
language translation is accomplished, is a waste of time and money
to the user since he must pay for this time though he gets no
problem answers from it. Secondly, the user has specified the
logical flow and arithmetic details of his solution in the source
language. However, when the machine "hangs up" or when he
attempts to debug his program, all he finds displayed on the
machine console is the machine language. (On large machines he
gets equivalently an esoteric print-out in a symbolic form of
machine language.) To overcome these difficulties one could use
an interpretive translator of the source language instead, but the
historical deficiencies of interpreters, loss of memory space and
loss of speed of execution, have caused this solution to be shunned.

Another solution is also possible: design a machine which
executes an algebraic language directly as its "machine language."
This approach is based on a recognition that once the allowable
syntax and associated semantics of language statements have been
firmly specified it is a matter of choice whether to write a compiler,
to write an interpreter or to build an interpreter out of hardware.
The software choice has been almost overwhelmingly to write a
compiler. Since the choice of hardware interpreter, or machine,
has not been made, and in fact has hardly been explored to any
great extent, a study has been made in order to see if this choice
leads to a system which is competitive with the usual software
system. It should be understood that such a machine has not been
constructed. However, the design² is sufficiently complete that
construction seems feasible.


Language— design  philosophy 

Since the machine language is to be an algebraic one it seemed
reasonable to choose a simple subset of the most commonly used
one, FORTRAN. This eliminates the necessity for inventing still
another such language and allows attention to be focused on
machine design. In fact, the subset chosen is quite close to that
known as Preliminary FORTRAN for the IBM 1620, which is
complete enough to be quite useful, but which does not include
such innovations as subroutines, etc. In addition, the usual "built
in" subroutines SIN (x), COS (x), etc., are not included. Their
inclusion would require additional effort for their hardware
implementation which did not appear to be worth expending at this
time.

¹ IEEE Trans., EC-16, vol. 4, pp. 485-499, August, 1967.

² See final technical report for Contract AF 19(628)-2798.

The FORTRAN statement types which are accepted by the
machine as machine language are in the table that follows.¹

Statement                     Comment

a = b                         The value of the arithmetic expression b
                              is stored in the memory location referenced
                              by the variable name a, which may have
                              up to two subscripts.

GO TO n                       Program control is transferred to the
                              statement numbered n.

GO TO (n1, n2, ..., nm), i    Program control is transferred to one of
                              the statements numbered n1, n2, ..., nm
                              depending on the value of i at the time
                              this statement is executed.

IF(e) n1, n2, n3              Program control is transferred to the
                              statement numbered n1 if the algebraic
                              expression e is negative, to that numbered
                              n2 if e is zero, and to that numbered
                              n3 if e is positive.

PAUSE                         Program execution is halted until restarted
                              by console switch.

DO n i = m1, m2, m3           All statements following this one in the
                              program, including the statement numbered
                              n, are executed repeatedly. The first
                              execution is with i equal to m1; i is
                              incremented by the value of m3 before each
                              succeeding execution. This continues until
                              i is greater than m2, at which time program
                              control is transferred either to the
                              statement following n or to that statement
                              required by the DO sequencing rules for
                              DO nests. If m3 is not given it is
                              understood to be 1.

CONTINUE                      This statement has the effect of the "no
                              operation" instruction in conventional
                              machines. Program control goes to the
                              next statement in the program unless the
                              CONTINUE is the last statement in the
                              range of a DO. In this case normal DO
                              sequencing takes place.

END                           This statement generates a control signal
                              to start execution of the program.

READ, List                    These statements cause data to be read
PRINT, List                   or printed, respectively, in accordance
                              with the specified list of variables, which
                              may be subscripted; however, the "implied
                              DO" feature has not been implemented.
                              No FORMAT control is available with this
                              machine, therefore no statement number
                              need be given.

DIMENSION v, v, ...           This statement has the effect of reserving
                              memory space for the subscripted
                              variables v. Each v stands for a variable
                              name followed by parentheses enclosing
                              one or two constants.

¹ Some familiarity with the FORTRAN language is assumed.
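The DO sequencing rule in the table can be sketched in a few lines. This is an illustration only, not the machine's circuitry: the function name is hypothetical, and it assumes (as the table's wording suggests) that the comparison with m2 is made after the increment, so the range is executed at least once, and that m3 is positive.

```python
def do_index_values(m1, m2, m3=1):
    """Yield the successive values taken by the DO index i.

    Assumption: the test against m2 follows the increment, so the
    range of the DO runs at least once even when m1 > m2.
    """
    i = m1
    while True:
        yield i        # the range of the DO is executed with this value of i
        i += m3        # i is incremented by m3 (understood to be 1 if omitted)
        if i > m2:     # sequencing stops once i is greater than m2
            return
```

For instance, `list(do_index_values(1, 10, 3))` walks the index through 1, 4, 7, 10.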


No  distinction  is  made  in  this  machine  between  fixed  (integer) 
and  floating  point  (real)  variables.  These  may  have  names  of  any 
length,  starting  with  any  alphabetic  character. 

Fixed point constants may be specified, in a program or as data,
as any combination of one to four numeric characters preceded
by a + or - sign; however, these are converted to an internal
decimal floating point number and so there are no restrictions on
"mixed mode" expressions. Statement numbers must be unsigned
fixed point constants, which are not so converted since they only
affect program control and not arithmetic processing.

Floating point constants are specified in the form of a mantissa
of one to four numeric symbols preceded by a decimal point (and
a + or - sign). These are followed by the character E and a single
(positive or negative) digit representing the power of ten in the
usual scientific notation.

These  constraints  on  number  size  and  format  are  made  to 
simplify  certain  circuits  and  could  easily  be  relaxed  if  desired.  The 
restriction  to  a  two-subscript  maximum  for  subscripted  variables 
is  similarly  motivated. 

Internally, all numerical data require three 8-bit words (Fig.
1). The first two words contain the four-digit mantissa, packed two
per word in a 4-bit code for each digit. A decimal point is assumed
to exist to the left of the most significant digit. The most significant
two bits of the third word are zero. The third bit is 0 if the
mantissa is positive, or 1 if it is negative, and similarly the fourth
bit is 0 or 1 if the exponent is, respectively, positive or negative.
The single exponent digit occupies the least significant four bits
of this word. All other characters occupy a full 8-bit word of which
the two most significant bits are 1's. Any numeric characters which
are symbols of a variable, e.g., the "2" in AB2X, also occupy a
full word of this type. Statement numbers are simply packed 2
digits per word and always occupy 2 full words.
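The three-word layout just described can be checked with a small sketch. The function names are hypothetical and the model is only a software illustration of the bit positions given in the text, not of the machine's circuits.

```python
def encode_number(mantissa_digits, exponent, negative=False):
    """Pack a four-digit mantissa and a one-digit exponent into three 8-bit words."""
    if len(mantissa_digits) != 4 or not mantissa_digits.isdigit():
        raise ValueError("mantissa must be exactly four decimal digits")
    if not -9 <= exponent <= 9:
        raise ValueError("exponent must be a single decimal digit")
    d = [int(c) for c in mantissa_digits]
    word1 = (d[0] << 4) | d[1]            # two mantissa digits, 4 bits each
    word2 = (d[2] << 4) | d[3]            # remaining two mantissa digits
    word3 = ((int(negative) << 5)         # third bit from the left: mantissa sign
             | (int(exponent < 0) << 4)   # fourth bit: exponent sign
             | abs(exponent))             # low four bits: exponent digit
    return [word1, word2, word3]


def decode_number(words):
    """Recover (mantissa_digits, exponent, negative) from three words."""
    w1, w2, w3 = words
    digits = "".join(str(n) for n in (w1 >> 4, w1 & 0xF, w2 >> 4, w2 & 0xF))
    exponent = w3 & 0xF
    if w3 & 0x10:
        exponent = -exponent
    return digits, exponent, bool(w3 & 0x20)
```

Encoding the mantissa .5739 with exponent -4 gives the words 0101 0111, 0011 1001 and 0001 0100, which is the example of Fig. 1.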

Chapter 31 | System design of a FORTRAN machine  365

+0.5739 E-4 in three consecutive words in memory

Word 1    0 1 0 1   0 1 1 1
Word 2    0 0 1 1   1 0 0 1
Word 3    0 0 0 1   0 1 0 0    Signs and exponent

Word 3, left to right: two zero bits, number sign, exponent sign,
exponent (four bits).

Fig. 1. Data format in memory.

Before proceeding with the description of the overall
characteristics of a machine that loads and executes the language
specified above, it may be well to indicate two basic design goals.

1  The card deck or tape containing the Hollerith or BCD
version of the English language form of a source program
should be the only deck or tape required at any time to
execute the program.

2  Once this program is loaded into memory and execution
started, any look "into the machine" should reveal
information in the same form in which it was entered. Thus if
the program is executing X = A + B, then one should find
"X", "=", "A", "+", "B", at least in their BCD form.

The  second  goal  has  been  compromised  somewhat  as  far  as  the 
internal  representation  of  the  program  is  concerned  in  the  interest 
of  execution  speed.  However,  all  such  compromises  have  been  kept 
to  a  minimum.  In  addition,  the  mechanisms  by  which  one  can  take 
such  looks  "into  the  machine"  are  such  as  to  conceal  these  com- 
promises. 


Memory  organization 

The machine is, in effect, a hardware version of a "one-pass-
load-and-go" compiler and it operates in two modes. In the load
mode FORTRAN statements are read. They are analyzed as
required and stored in memory. When the last statement has been
stored, the execution mode is entered and program execution
begins at the first executable statement that was read. The
input-output device for the machine design is a Flexowriter Model SPD.
Programs are assumed to be punched onto a paper tape, one
statement per line, followed by a "carriage return" which
generates a paper tape symbol to separate statements. When this tape
is read into memory, blanks are automatically "squeezed out."

The memory around which the machine is designed is a 4096-
word, 8-bit-per-word, random-access core memory.¹ It is treated
by the control circuits as though it consisted of three distinct
regions.

1  Input-output (I/O) buffer: One statement at a time is loaded
sequentially into memory locations 0-99. The six-bit paper
tape codes are first converted to internal (often different)
six-bit memory codes and stored in the six least significant
positions of the 8-bit words. The carriage return symbol is
encoded into a special "end-of-statement" symbol, which is
printed in the paper as a special character. When this symbol
is read the tape is also automatically stopped.

2  Symbol table area: Memory locations 4095 and sequentially
downward in memory hold the programmer's names for
variables, statement numbers, etc., as well as "pointers" to
machine addresses, plus empty (before execution) locations
for data.

3  Program area: Memory locations 100 and sequentially
upward hold the FORTRAN program, in a slightly modified
form.
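The three regions can be pictured with a minimal memory model. This is a sketch under stated assumptions: the class and method names are hypothetical, and the collision check is an illustrative rule that the source does not describe.

```python
MEMORY_WORDS = 4096        # 4096 words of 8 bits each
IO_BUFFER_END = 99         # the I/O buffer occupies locations 0-99
PROGRAM_START = 100        # the Program Area grows upward from 100
SYMBOL_TABLE_TOP = 4095    # the Symbol Table grows downward from 4095


class Memory:
    def __init__(self):
        self.words = [0] * MEMORY_WORDS
        self.program_counter = PROGRAM_START      # next free Program Area word
        self.symbol_pointer = SYMBOL_TABLE_TOP    # next free Symbol Table word

    def _check_room(self):
        # Hypothetical overflow rule: the two growing regions must not meet.
        if self.program_counter > self.symbol_pointer:
            raise MemoryError("Program Area and Symbol Table collide")

    def append_program(self, word):
        """Store one word at the Program Counter and advance it upward."""
        self._check_room()
        address = self.program_counter
        self.words[address] = word & 0xFF
        self.program_counter += 1
        return address

    def push_symbol(self, word):
        """Store one word at the top of the Symbol Table and move downward."""
        self._check_room()
        address = self.symbol_pointer
        self.words[address] = word & 0xFF
        self.symbol_pointer -= 1
        return address
```

Letting the two areas grow toward each other means neither needs a fixed size; only their sum is bounded by the 4096-word memory.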


Operating  modes 

The load mode circuits control the input of FORTRAN statements.
They place certain information in the Symbol Table Area and the
modified form of the FORTRAN statements in the Program Area.
It is while in this mode that the necessary searches for variable
names take place and machine addresses are assigned. These
addresses replace portions of the variable names in the statement
as it appears in the Program Area. Similar processing replaces
programmer-assigned statement number references in the Program
Area with various internal "pointers" for control of GO TO, DO,
and IF statements. This modification is done so that statement
execution in the execute mode can proceed at high speed. In short,
the FORTRAN statement in the Program Area is modified to the
extent that variable names are replaced by actual data addresses
and statement number references are replaced by actual addresses
of statement locations in the Program Area. This translation is
done once only, when the statement is analyzed in the load mode.
It might be noted here that because of the "one-pass" nature of
the translation (a given statement is analyzed only once), certain

¹ 5-μs cycle time. EE Co. Model 781.




of  the  pointers  correspond  to  indirect  addresses.  Figure  2  shows 
a  sketch  of  the  overall  system  control  and  Tables  2  to  7  show  to 
what  extent  the  original  statements  have  been  altered. 


Loading  a  program 

A program, which is punched in a paper tape, is loaded into
memory by energizing the tape read circuit which reads a
statement on the tape, including the end-of-statement symbol, into
the I/O buffer. The read circuit is then de-energized. The least
significant 6 bits of each word of the buffer hold the internal BCD
representation of each symbol.

A scan circuit (Fig. 3) now picks up each symbol in the
statement from left to right and as each symbol is decoded it reacts
as follows.

1  If the first symbol is a digit, control is turned over to a
Statement Number Load circuit. This circuit shifts the
statement number digit by digit into a register (SHR). The
maximum allowable length of a statement number is 4 digits
and all statement numbers are carried internally in this
form, i.e., a programmer's statement number 13 is carried
in 2 words as 0013. A search is now made of the Symbol
Table area. One of three possibilities exists:

a  The statement number is not found in the Symbol Table.
It is put into the Symbol Table followed by the value
of the current Program location. The statement number
is also put into the Program Area starting at this location
and the Program Counter incremented appropriately,
i.e., by 2 since two 8-bit words are used.

b  The statement number is found in the Symbol Table
because it has been previously referred to by an IF or
GO TO. The current value of the Program Counter is
placed into the two memory locations following the
statement number. (These were left blank when the
statement number was previously processed.) The
statement number is put into the Program Area and the
Program Counter is incremented.

c  The statement number is found in the Symbol Table
because it has been previously referred to by a DO
statement. A description will be deferred until the DO
statement loading is described since the circuit's
behavior is more meaningful in that context.

2  After a statement number has been processed in this fashion
or if the first symbol in the statement was not a digit (no
statement number was assigned) then the scan circuit
continues to pick up each symbol from left to right until it
is able to classify the statement as to type. It then turns
over control to the appropriate loading circuit as indicated
in Fig. 3.

[Figure: block diagram. Labels: memory address register; memory,
divided into symbol table, program area, and I/O buffer; memory
buffer register; input-output (read/print); program; arithmetic unit.]

Fig. 2. FORTRAN computer system.
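Cases a and b of the Statement Number Load circuit amount to filling in a forward reference. The sketch below is an illustration only: the class and method names are hypothetical, the Symbol Table is modeled with dictionaries rather than memory words, and case c (numbers referenced by a DO) is left out.

```python
class StatementTable:
    """Statement-number entries allocated downward from the top of memory."""

    def __init__(self, top=4095):
        self.next_free = top
        self.slot_of = {}   # statement number -> address of its two-word address slot
        self.target = {}    # slot address -> Program Area address (None while blank)

    def _enter(self, number):
        self.next_free -= 2          # two words hold the packed four-digit number
        slot = self.next_free        # the following two words hold the address
        self.next_free -= 2
        self.slot_of[number] = slot
        self.target[slot] = None     # left blank until the statement is loaded
        return slot

    def reference(self, number):
        """An IF or GO TO names `number`; return the slot address to embed."""
        if number not in self.slot_of:
            return self._enter(number)
        return self.slot_of[number]

    def define(self, number, program_counter):
        """The statement numbered `number` is loaded at `program_counter`."""
        slot = self.slot_of.get(number)
        if slot is None:
            slot = self._enter(number)          # case a: not yet in the table
        self.target[slot] = program_counter     # cases a and b: record the location
        return slot
```

With an empty table, a reference to statement 15 yields slot address 4093, and loading statement 15 at Program Area address 250 later fills that slot, which agrees with the addresses shown in Table 2.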

All of these loading circuits put the statements into the
Program Area after replacing variable names and statement number
references in the program with addresses or pointers. They also
replace reserved names such as GO TO or CONTINUE with a
single 8-bit code (token). Each unique variable name in the
program, however, is also stored in the Symbol Table once using an
8-bit code for each symbol. For nonsubscripted variables the three
words following the name are reserved for the data that will be
associated with this name when the program is executed.
Subscripted variable names are found in DIMENSION statements
which must precede the use of these variables in the program.
In this case as many locations following the name are reserved
as have been computed from the DIMENSION statement. The
name in the Symbol Table is preceded by a special symbol a, to
indicate that it is a subscripted variable. In addition, the first of
the two subscript values in the DIMENSION statement is also
stored immediately following the name. This number is needed
during program execution for constructing the proper element
of the array specified by a subscripted variable.¹ The address of
the data location replaces all symbols of the variable name in the
Program Area except for the first. This symbol, which must be
alphabetic, is retained in the Program Area as an indicator that
this is indeed a variable. All special symbols such as (, ), +, -,
etc. are simply stored sequentially in the Program Area in the 8-bit
BCD form as they appear in the original statement.

¹ A pointer to the next available location in the Symbol Table is also stored
for speed in Symbol Table searching.

[Figure: flow diagram. Paper tape feeds the I/O buffer; the scan
circuit, after any statement number is processed, passes control to
one of the loading circuits: ARITHMETIC, DO, GO TO, COMPUTED
GO TO, READ, IF, CONTINUE.]

Fig. 3. Load processing sequence and control.
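The treatment of nonsubscripted variable names can be sketched as follows. The names `VariableTable` and `program_words` are hypothetical, subscripted variables are omitted, and the choice of the first word after the name as the data address is inferred from the addresses shown in Tables 3 and 4.

```python
class VariableTable:
    """Nonsubscripted variables allocated downward from the top of memory."""

    def __init__(self, top=4095):
        self.next_free = top
        self.data_address = {}

    def lookup(self, name):
        """Return the data address for `name`, entering it on first appearance."""
        if name not in self.data_address:
            self.next_free -= len(name)                # one word per symbol of the name
            self.data_address[name] = self.next_free   # first of the three data words
            self.next_free -= 3                        # three words reserved for data
        return self.data_address[name]


def program_words(name, table):
    """Words placed in the Program Area for one use of a variable."""
    address = table.lookup(name)
    # The first letter is kept as a marker; the address is packed two
    # decimal digits per word.
    return [name[0], f"{address // 100:02d}", f"{address % 100:02d}"]
```

Starting from an empty table, A is given data address 4094 and B data address 4090, as in Table 4; a five-letter name such as ITALY is given 4090, as in Table 3.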

Statement numbers in IF and GO TO statements are similarly
replaced by the address in the Symbol Table which holds the
address in the Program Area of the statement having that number.
Note that this is an indirect address to the statement. Statement
numbers in DO statements are dealt with somewhat differently
as will be explained later. Because variable names and statement
number references can appear many times in a program, these
searches of the Symbol Table are controlled by two special circuits,
the Variable Match Unit (VMU) and the Statement Match Unit
(SMU). These circuits indicate either that the name or statement
number is already in the Symbol Table or it is not. Thus the first
appearance of a variable name, statement number, or reference
to a statement number causes it to be put into the Symbol Table.
Subsequent references merely utilize these previously assigned
data or Program addresses. Therefore each name or statement
number is stored in the Symbol Table only once with an exception
noted below. In general, the programmer's statement is altered
only in the above described fashion. However, for ease of execution
the computed GO TO has its index parameter name, i.e., the "i"
in GO TO (n1, ..., nm), i, changed from the position following
the parenthesis to a position preceding the parenthesis.

The DO statement requires the most complex loading
algorithm. Basically, the idea is to place the DO statement itself,
essentially unchanged, into the Program Area but to extract the
range statement number (which specifies the last statement in the
range of the DO) and put it into the Symbol Table. It is there
preceded by a special symbol A, designating it as being referenced
by a DO, and followed by the Program Area address of the
corresponding DO statement. The DO statement in the Program Area
has its original statement number replaced by a special symbol,
λ, and an internal address which is determined as follows (see
Table 6).

a  If this DO is one of a nest of DO's, the internal address
is the Program Area address of the λ token of the next
preceding DO statement. This is easily found by a Symbol
Table search for the range statement number since there
is an entry in the Symbol Table corresponding to every DO
statement. Thus for a DO nest three deep all ending in
statement number 100, for example, there will be three
entries in "DO nest order" of the number 0100 each
followed by the corresponding DO statement Program Area
address.

b  If this DO is the first of a nest of DO's, or if it is the only
DO specifying a particular range statement number, then
this internal address is the program address of the next
statement outside the DO range, i.e., the address to which
control should go if this DO or DO nest is satisfied.

This outside address is found by the Statement Number Load
circuit at the time the last statement in the range appears in the
I/O buffer for loading. The circuit first detects that a matching
statement number in the Symbol Table is preceded by the symbol A.
It then extracts and saves the Program Area address of the first DO
and the last DO, if there is a nest, or simply the only address if there
is just one. The statement number is put in the Program Area as
always. In addition, the Program Area address of the λ token of
the last DO in the nest is also put in the Program Area immediately
following it. A special flip-flop, the LSFF, is also set. The
loading circuit for each statement type allowed to be the last
statement in a DO range tests this LSFF after it has loaded the
statement into the Program Area. If it is on, the current contents
of the Program Counter, the address of the next statement outside
the DO range, are used as the internal address in the first (or only)
DO of the nest.

It  should  be  noted  that  this  DO  range  statement  number 
together  with  its  own  Program  Area  location  will  also  appear  in 
the  Symbol  Table  without  a  preceding  A.  This  is  necessary  because 
it  is  possible  (and  even  legal  in  some  cases!)  to  have  an  IF  or 
GO  TO  refer  to  it  also. 

The method used to design the circuits which implement these
functions is the same in each case. From the English language
description of the function a sequential circuit state diagram is
constructed. The circuit is then synthesized from the state diagram
using established methods. The state diagrams of the Arithmetic
Statement Loading circuits and the Variable Match Unit, which
are used during Loading, are shown in the Appendix.

The hardware implementation of the state diagram of the
Variable Match Unit is also described there.

Executing  a  program 

When the END statement signaling the end of a source program
is encountered by the scan unit, the machine leaves its load mode,
executes an automatic RESET, and enters the execution mode.
(Reset forces the address 100 into the Program Counter.) Pressing
the console start button causes statement execution to begin at
the first executable statement which is always found at memory
address 100. There is a separate statement execution circuit for
each statement type. In addition, the Statement Number
processing circuit reacts to a digit as the first symbol in a statement.
Each of these circuits is in an initial state when execution begins.
One and only one can leave its initial state when the first symbol
of a statement is read from memory. The responding circuit then
retains control as it executes the statement until the
end-of-statement symbol is read from memory. It then returns to its
initial state. The first symbol of the next statement, as indicated
by the Program Counter, is read and causes some circuit to leave
its initial state, etc. Thus the first symbol of a statement acts like
the "operation code" portion of a conventional computer
instruction word. The first symbol must be (since the load circuitry causes
this) one of the 8-bit tokens for the various statement types, or
a digit of a statement number, or the alphabetic character of the
variable on the left of the "=" symbol of an arithmetic statement.
The tokens are represented in this paper as shown in Table 1.


Table 1

Statement type                  Token

GO TO n                         GO TO
GO TO (n1, ..., nm), i          COMGOTO
IF (e) n1, n2, n3               IF
PAUSE                           PAUSE
DO n i = m1, m2, m3             DO
CONTINUE                        CONTINUE
READ                            READ
PRINT                           PRINT



It is possible, however, for the DO execution circuitry to leave
its initial state either by reading of the DO or by reading of the
λ token immediately following it. The former causes DO
initialization, the latter causes DO indexing and testing as will be
described later.
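The way the first symbol of a statement selects exactly one execution circuit can be sketched as a dispatch on that symbol. The strings below are stand-ins for the 8-bit tokens, `DO_INDEX_TOKEN` is a hypothetical placeholder for the special token that follows DO in the Program Area, and the returned names are illustrative labels rather than the paper's circuit names.

```python
# Stand-ins for the 8-bit statement tokens of Table 1.
STATEMENT_TOKENS = {"GOTO", "COMGOTO", "IF", "PAUSE", "DO",
                    "CONTINUE", "READ", "PRINT"}
DO_INDEX_TOKEN = "<DO-INDEX>"   # placeholder for the special token after DO


def responding_circuit(first_symbol):
    """Name the one execution circuit that leaves its initial state."""
    if first_symbol in STATEMENT_TOKENS:
        return first_symbol              # a statement token acts as the operation code
    if first_symbol == DO_INDEX_TOKEN:
        return "DO"                      # the DO circuit also answers to its own special token
    if first_symbol.isdigit():
        return "STATEMENT NUMBER"        # a digit begins a statement number
    if len(first_symbol) == 1 and first_symbol.isalpha():
        return "ARITHMETIC"              # a letter begins an arithmetic statement
    raise ValueError(f"no circuit responds to {first_symbol!r}")
```

Because the load circuitry guarantees what the first symbol can be, the error branch should never be reached for a correctly loaded program.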

The action of the execution circuits is briefly given below.

Statement number processing

When the first symbol of a statement is a digit this circuit is
energized. If there are only four digits (packed into two memory
words) the circuit returns to its initial state and the remainder
of the statement is executed. If there are eight digits (packed into
four memory words), the last four digits (the address of the λ of
the last, or only, DO in a nest) are saved in a register, SSAR. The
LSFF is turned on, the circuit returns to its initial state and the
remainder of the statement is executed. If the remainder of the
statement is not an IF, GO TO, or DO statement, the execution
circuitry in control executes the statement and then tests for the
LSFF being on. If it is on, the Program Counter contents are
replaced with the SSAR contents, the LSFF is reset, and the circuit
returns to its initial state. In this case the SSAR holds the program
address of the λ token of the innermost DO. When this λ is read, DO
indexing and testing take place. If the LSFF is off, the circuit returns
to its initial state.

GO  TO  n 

The  GOTO  token  energizes  this  circuit.  The  four-digit  address 
(packed  into  two  memory  words)  immediately  following  the  token 
is  extracted.  The  contents  of  this  address  are  put  into  the  Program 
Counter  and  the  circuit  returns  to  its  initial  state. 
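The indirect transfer can be traced with a short sketch. Memory is modeled as a dictionary from address to a two-digit decimal value, a simplification of the packed 4-bit digit code, and the function names are hypothetical. The sample contents are those of the paper's Table 2 for a GO TO that refers to statement 15.

```python
def unpack_address(high, low):
    """Two words of two decimal digits each form a four-digit address."""
    return high * 100 + low


def execute_goto(memory, token_address):
    """Return the new Program Counter value for the GO TO at token_address."""
    # The two words after the token give a Symbol Table address.
    pointer = unpack_address(memory[token_address + 1], memory[token_address + 2])
    # The Symbol Table runs downward, so the low half sits at pointer - 1.
    return unpack_address(memory[pointer], memory[pointer - 1])


memory = {
    100: "GOTO", 101: 40, 102: 93,   # token, then the address 4093
    4095: 0, 4094: 15,               # statement number 0015
    4093: 2, 4092: 50,               # its Program Area address, 0250
}
new_program_counter = execute_goto(memory, 100)
```

Here the Program Counter is set to 250, the location of statement 15.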

Example.¹ GO TO 15 (Table 2).

Table 2

Symbol table                             Program area
Address  Contents                        Address  Contents

4095     00 } Machine form for           0100     GOTO
4094     15 } statement 15               0101     40 } Address of the address
4093     02 } Address of                 0102     93 } of statement 15
4092     50 } statement 15               0103     (end-of-statement symbol)

                                         0250     00 } Statement 15
                                         0251     15 } in the program

¹ All examples are written as though this statement or statements were the
first in the program.

GO TO (n1, n2, ..., nm), i

The COMGOTO token energizes this circuit. The initial
alphabetic symbol of i, now immediately following the token, is read
and discarded and the four-digit address immediately following
is extracted. The contents of this address (the current value of i)
are put into a register and decremented by one.

1  If the result is zero, the four-digit address following the
left parenthesis is extracted. The contents of this address
are put into the Program Counter and the circuit returns
to its initial state.


2  If the result is nonzero, the four-digit address following the
left parenthesis is read and discarded. The register is
decremented by one again.

3  If the result is zero, the four-digit address following the next
comma is treated as in 1 above.

4  If the result is nonzero, the four-digit address following the
next comma is read and discarded. The register is
decremented by one again.

Steps 3 and 4 above are repeated until the register is zero. If
the right parenthesis is read while the register is nonzero an error
condition has been found and will be indicated.
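The countdown in steps 1 to 4 can be sketched as a loop. The function name is hypothetical, the value of i is assumed to be an integer, and `targets` stands for the Program Area addresses obtained after the indirect lookup of each list entry.

```python
def computed_goto(i, targets):
    """Select the i-th target (counting from 1) by counting a register down."""
    register = i - 1                 # the value of i, decremented by one
    for target in targets:
        if register == 0:
            return target            # zero: this address goes to the Program Counter
        register -= 1                # nonzero: discard this address, decrement again
    # The right parenthesis was reached with the register still nonzero.
    raise ValueError("index selects no statement in the list")
```

For a list whose entries resolve to 250, 350 and 553, an index of 2 selects 350; an index of 0 or 4 reaches the right parenthesis and is reported as an error.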

Example. GO TO (5, 10, 150), ITALY (Table 3).

Table 3

Symbol table                             Program area
Address  Contents                        Address  Contents

4095     I                               0100     COMGOTO
4094     T                               0101     I
4093     A                               0102     40 } Address of the
4092     L                               0103     90 } data for ITALY
4091     Y                               0104     (
4090                                     0105     40 } Address of the address
4089                                     0106     85 } of statement 5
4088                                     0107     ,
4087     00 } Representation of          0108     40 } Address of the address
4086     05 } statement 5                0109     81 } of statement 10
4085     02                              0110     ,
4084     50                              0111     40 } Address of the address
4083     00                              0112     77 } of statement 150
4082     10                              0113     )
4081     03 } Address of                 0114     (end-of-statement symbol)
4080     50 } statement 10
4079     01                              0250     00
4078     50                              0251     05
4077     05 } Address of                 0350     00
4076     53 } statement 150              0351     10
                                         0553     01
                                         0554     50

IF(e) n1, n2, n3

The IF token energizes this circuit. The left parenthesis
immediately following the token is read. Control is then given temporarily
to the Arithmetic Statement execution circuit. The latter circuit
is forced to the state in which it would be if it were ready to
evaluate an expression to the right of the equal sign in an
Arithmetic Statement. A special F/F, the IFFF, is also set to 1. The
expression e of the IF statement is read and evaluated until the
final right parenthesis of the IF statement is read. Since the
Arithmetic Statement circuit was not allowed to read the initial left
parenthesis, it would normally go to an error condition under these
circumstances of "unbalanced" parentheses. However, sensing that
the IFFF is set to 1, it resets the IFFF, places the value of the
expression e just evaluated into the accumulator, returns to its own
initial state, and re-energizes the IF statement circuit. The
accumulator is equipped to sense its own contents and energizes one
of three signal lines depending on whether the number is zero,
positive, or negative. The IF circuit senses these lines and reacts
as follows.

1  If the accumulator signal is negative, the next four-digit
address (n1) is extracted. The contents of this address are
put into the Program Counter and the circuit returns to
its initial state.

2  If the accumulator signal is zero, the next four-digit address
is skipped over. The four-digit address following the next
comma (n2) is treated as in 1 above.

3  If the accumulator signal is positive, the next 2 four-digit
addresses and the intervening comma are skipped over. The
four-digit address following the next comma (n3) is treated
as in 1 above.

Example. IF(A - B) 10, 20, 20 (Table 4).
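The three-way choice made by the IF circuit reduces to a sign test on the accumulator. In this sketch the function name is hypothetical and the three arguments after the value stand for the Program Area addresses reached through n1, n2 and n3.

```python
def arithmetic_if(accumulator, n1, n2, n3):
    """Return the next Program Counter value for IF(e) n1, n2, n3."""
    if accumulator < 0:
        return n1      # negative line: take the first address
    if accumulator == 0:
        return n2      # zero line: skip one address, take the second
    return n3          # positive line: skip two addresses, take the third
```

With the addresses of Table 4 (statement 10 at 350, statement 20 at 441), `arithmetic_if(a - b, 350, 441, 441)` goes to 350 when A is less than B and to 441 otherwise.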

PAUSE

The PAUSE token energizes this circuit. The end-of-statement
symbol is read and discarded. All execution circuits are forced
to a state 0' and automatic reading of the memory ceases. A
START signal, initiated by a console switch, is required to return
these circuits to state 0 and to initiate memory reading at the
location specified by the current contents of the Program Counter.

Example. PAUSE (Table 5).

Table 4

Symbol table                             Program area
Address  Contents                        Address  Contents

4095     A                               0100     IF
4094                                     0101     (
4093                                     0102     A
4092                                     0103     40
4091     B                               0104     94
4090                                     0105     -
4089                                     0106     B
4088                                     0107     40
4087     00                              0108     90
4086     10                              0109     )
4085     03 } Address of                 0110     40 } Address of the address
4084     50 } statement 10               0111     85 } of statement 10
4083     00                              0112     ,
4082     20                              0113     40 } Address of the address
4081     04 } Address of                 0114     81 } of statement 20
4080     41 } statement 20               0115     ,
                                         0116     40 } Address of the address
                                         0117     81 } of statement 20
                                         0118     (end-of-statement symbol)

                                         0350     00
                                         0351     10
                                         0441     00
                                         0442     20

Table 5

Symbol table                             Program area
                                         Address  Contents

(not applicable)                         0100     PAUSE
                                         0101     (end-of-statement symbol)

DO n i = m1, m2, m3 (or DO n i = m1, m2)

This circuit is energized (i.e., caused to leave its initial state) either
by a DO token or by the λ token. Its action is different in these
two cases and will be described separately.

1 The circuit is energized by the DO token: The λ token and
the four-digit address immediately following are read and
discarded. The initial alphabetic symbol of i is read and
discarded and the four-digit address immediately following
is extracted and saved in a register called SAR. The =
symbol is read and discarded. The initial value m1 of this
statement can be either purely numeric or it may be the
name of a variable.

a If it is purely numeric the load circuitry will have
replaced it with the internal machine representation of
the number. Therefore this number is simply read and
stored in the Symbol Table starting at the address given
in the SAR register.

b If it is the name of a variable, the initial alphabetic
symbol is read and discarded. The four-digit address
following is extracted. The contents of this address are
treated as in a above.

In either event then, i is given the value m1 as required.
The remainder of the DO statement including the ‡ symbol
is read and discarded and the circuit returns to its
initial state.

2 The circuit is energized by the λ token: The four-digit
address immediately following is extracted and saved in the
SSAR. The initial alphabetic symbol of i is read and
discarded, the four-digit address immediately following is put
into the SAR and the contents of this address are placed
in the accumulator. (This is the current value of i.) The =
symbol and all symbols up to and including the next comma
are read and discarded. The final value, m2, may be numeric
or the name of a variable.

a If it is numeric, this value is placed in a numeric register,
SHR.

b If it is the name of a variable the initial symbol is read
and discarded. The contents of the four-digit address
following are extracted and placed in the numeric register
SHR. The next symbol is read. This will be a comma
if m3 has been specified or ‡ if m3 has not been specified.

c If it is a comma either the following purely numeric
value is added to the contents of the accumulator or the
contents of the following four-digit address are added.

d If it is the ‡ symbol then the contents of the accumulator
are incremented by one. In either event, after the current
value of i has been incremented by either m3 or one, the
contents of the accumulator are put in the Symbol Table
starting at the address given in the SAR.

Now the final value, saved in the SHR, is subtracted from the
accumulator. If the accumulator signal is positive then the value
of i must be greater than the final value m2. Therefore the
address in the SSAR is placed in the Program Counter and the
circuit returns to its initial state. The address in the SSAR will
either be the address of the λ token of a preceding DO in the
nest or it will be the address of the next statement outside the
DO nest depending on which DO statement is being executed.
If the accumulator signal is not positive then the value of i is less
than or equal to m2 and the circuit just returns to its initial state.
Thus the next statement after the DO statement will be executed.
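
The re-entry path of the DO circuit amounts to an increment, a subtraction, and a sign test. The following Python sketch models only that arithmetic. The function names do_reentry and run_do are illustrative and are not part of the machine, and the registers appear only in comments.

```python
def do_reentry(i_value, m2, m3=None):
    """Model one re-entry of the DO circuit; return (new_i, leave_loop)."""
    acc = i_value                      # accumulator <- current value of i
    acc += 1 if m3 is None else m3     # add m3, or one when m3 is absent
    new_i = acc                        # written back at the address held in SAR
    acc -= m2                          # subtract the final value saved in SHR
    leave_loop = acc > 0               # positive result: i has passed m2
    return new_i, leave_loop


def run_do(m1, m2, m3=None):
    """Return the values of i for which the loop body would be executed."""
    i = m1                             # first pass: i is given the value m1
    executed = []
    while True:
        executed.append(i)             # the statements in the DO range run here
        i, leave_loop = do_reentry(i, m2, m3)
        if leave_loop:                 # Program Counter <- SSAR
            return executed


if __name__ == "__main__":
    print(len(run_do(1, 150)))         # body count for DO 5 I = 1, 150
    print(run_do(1, 10, 4))
```

Note that in this model, as in the description above, the incremented value of i is stored before the comparison is made, so i holds a value beyond m2 when the loop is left.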

Example. (See Table 6.)

DIMENSION B(20, 10)‡
DO 5 IT = 1, 100, L‡
DO 5 I = N, M‡
5 A = B(IT, I)‡

CONTINUE 

The CONTINUE token energizes this circuit. The ‡ symbol is read.
If the LSFF is not on, the circuit returns to its initial state. If
the LSFF is on, it is turned off. The contents of the SSAR
replace the contents of the Program Counter and the circuit
returns to its initial state. Thus if this statement is either not
labeled or is not the last statement in a DO range, its execution
has no effect on the program. The example assumes the usual case
where it is the last statement in a DO range.

Example.  (Table  7) 

DO 5 I = 1, 150‡
5 CONTINUE‡

READ, list. (PRINT, list.)

The READ token energizes this circuit which then energizes the
Flexowriter read circuits. Data from paper tape is read into the
I/O buffer until the end-of-statement symbol, ‡, is stored. The data
must be punched as one to four decimal digits for fixed point
numbers or one to four decimal digits preceded by a decimal point
for floating point numbers. The latter may also be followed by
the letter E and a single positive or negative digit indicating a
power of ten. Numbers must be separated by a comma to
distinguish them, since no FORMAT information is available and the
read circuits "squeeze out" blanks.


Table 6

Symbol table                      Program area
Address  Contents                 Address  Contents
4095     α                        0100     DO
4094     B                        0101     λ
4093     34 } Next free Symbol    0102     01 } Address of statement
4092     88 } Table address       0103     57 } following the DO nest
4091        } Machine form of     0104     I
4090        } the constant 20     0105     34 } Address of data
4089     04 }                     0106     81 } for IT
                                  0107     =
3488     Δ                        0108     00
3487     00 } Machine form of     0109     01
3486     05 } statement 5         0110     04
3485     01 } Address of 1st      0111     ,
3484     01 } DO in nest          0112     01
3483     I                        0113     00
3482     T                        0114     04
3481                              0115
3480                              0116     L
3479                              0117     34
3478     L                        0118     77
3477                              0119     ‡
3476                              0120     DO
3475                              0121     λ
3474     Δ                        0122     01 } Address of preceding
3473     00                       0123     01 } DO in the nest
3472     05                       0124     I
3471     01 } Address of 2nd      0125
3470     21 } DO in nest          0126     68
3469     I                        0127
3468                              0128     N
3467                              0129     34
3466                              0130     64
3465                              0131
3464                              0132     M
3461     M                        0133     34
3460                              0134     60
3459                              0135     ‡
3458                              0136     00
3457     00                       0137     05
3456     05                       0138     01 } Address of last
3455     01 } Address of          0139     21 } DO in the nest
3454     36 } statement 5         0140     A
3453     A                        0141     34
3452                              0142     52
3451                              0143     =
3450                              0144     B
                                  0145     40
                                  0146     91
                                  0147     (
                                  0148     I
                                  0149     34
                                  0150     81
                                  0151
                                  0152     I
                                  0153     34
                                  0154     68
                                  0155     )
                                  0156     ‡
                                  0157

The first set of digits starting at the beginning of I/O buffer,
memory address 0, is read into a 24-bit register (which is the size
of the three 8-bit memory words required for data). Numerical
information in the I/O buffer is in a 6-bit code. The two most
significant bits are 0 if the code is for a numeric character. The
placing of information into the 24-bit register is easier to
understand if we consider it as a 16-bit mantissa register M, which can
hold four decimal digits, and an 8-bit sign and exponent register
X, which can hold 2 bits of sign information and an exponent digit.

Both registers are set to zero initially. If the first character is a
minus sign, the bit in the mantissa sign position of X is set to one.
(The internal form of data representation was described earlier
in the section on Language-Design Philosophy.) If it is a plus sign
no action is required since a zero in the mantissa sign position
indicates a positive mantissa. Further action depends on the next
character.

1 If the next character is numeric (or if there was no sign
given and the first character is numeric) this must be a fixed
point constant. The four bits of numeric information are
gated to the least significant four positions of register M.
If the next character is numeric, M is shifted left four
positions and this character is also gated to the least significant
position. This continues until the comma is read. The
numeric code for four is now gated to the least significant four
positions of X. Since the arithmetic unit assumes a decimal
point at the left of all data, this action insures that a fixed
point number is properly interpreted.

2 If the next character after the sign (if there is one) is a
decimal point this must be a floating point number. In this
case the following digits are stored into M as indicated
above, but three shifts of M are always taken, whether or
not four digits are stored in M. This is required to insure
proper interpretation of the number. If a comma follows
the series of digits no further action is taken. If an E follows
then the digit following it is placed in the least significant
4 positions of X. If a minus sign is found following the E
a setting of the exponent sign position of X precedes this
action. The comma is then read.


Table 7

Symbol table                      Program area
Address  Contents                 Address  Contents
4095     Δ                        0100     DO
4094     00                       0101     λ
4093     05                       0102     01
4092     01                       0103     22
4091     01                       0104     I
4090     I                        0105     40
4089                              0106     89
4088                              0107
4087                              0108     00
4086     00                       0109     01
4085     05                       0110     04
4084     01                       0111
4083     16                       0112     01
                                  0113     50
                                  0114     04
                                  0115
                                  0116     00
                                  0117     05
                                  0118     01 } Address of the
                                  0119     01 } DO statement
                                  0120     CONTINUE
                                  0121
                                  0122

After this first piece of data has been placed in M and X, the
alphabetic character following the READ token is read and
discarded. The next 4 digits are used as the address in which the
most significant two digits in M are stored and it is then
decremented appropriately to store the remainder of the data.
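
The conversion of one punched number into the M and X registers can be sketched in Python as follows. This is only a sketch under stated assumptions: the bit positions chosen for the two sign bits (0x80 for the mantissa sign and 0x40 for the exponent sign) are not given in this section, and the order in which the remainder of the datum is stored is inferred from the phrase "decremented appropriately". The names parse_number and to_words are hypothetical.

```python
MANTISSA_SIGN = 0x80   # assumed position of the mantissa sign bit in X
EXPONENT_SIGN = 0x40   # assumed position of the exponent sign bit in X


def parse_number(text):
    """Convert one datum such as '150', '-.25' or '.5E-3' into (M, X)."""
    m, x = 0, 0                         # both registers start at zero
    pos = 0
    if text[pos] in "+-":
        if text[pos] == "-":
            x |= MANTISSA_SIGN
        pos += 1
    floating = text[pos] == "."
    if floating:
        pos += 1
    shifts, seen_digit = 0, False
    while pos < len(text) and text[pos].isdigit():
        if seen_digit:                  # shift M left one BCD digit
            m = (m << 4) & 0xFFFF
            shifts += 1
        m |= int(text[pos])             # gate digit into the low four bits
        seen_digit = True
        pos += 1
    if not floating:
        x |= 4                          # fixed point: exponent digit four
        return m, x
    while shifts < 3:                   # floating: always three shifts in all
        m = (m << 4) & 0xFFFF
        shifts += 1
    if pos < len(text) and text[pos] == "E":
        pos += 1
        if text[pos] in "+-":
            if text[pos] == "-":
                x |= EXPONENT_SIGN
            pos += 1
        x |= int(text[pos])             # single exponent digit
    return m, x


def to_words(m, x):
    """Split M and X into three 8-bit words, most significant digits first."""
    return [(m >> 8) & 0xFF, m & 0xFF, x]


if __name__ == "__main__":
    print([f"{w:02X}" for w in to_words(*parse_number("150"))])
```

Under these assumptions the fixed point constant 150 becomes the three words 01, 50, 04, which is the form in which constants appear in the Program Area of the tables.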

The  remaining  data  in  the  I/O  buffer  are  then  stored  one  by 
one  in  sequence  at  the  addresses  given  by  the  remainder  of  the 
READ  list.  A  subscripted  variable  on  this  list  requires  additional 
arithmetic  operations  to  compute  the  correct  address  from  the 
current  index  values  and  the  original  DIMENSION  information 
stored  in  the  Symbol  Table.  These  operations  will  be  given  later 
in  the  Arithmetic  Statement  description. 

When the ‡ token in the I/O buffer is reached, the next
character in the READ list is read. If this character is also the ‡ token
then the circuit returns to its initial state. If, however, it is not,
then the Flexowriter is again energized so as to read data into
the I/O buffer, and processing proceeds as before until reading
of the ‡ of the READ statement returns the circuit to its initial
state.

The PRINT statement circuit operates in almost exactly inverse
fashion and will not be described in detail. The list variables are
used in sequence to extract data from the proper memory locations
and place it in the M and X registers. The contents of these
registers are then put sequentially into the I/O buffer, together with
6-bit codes for the decimal point, plus and minus signs, commas,
and the E symbol at appropriate places. All data are thus output
in floating point form. When the ‡ token is read, the Flexowriter
print circuits are energized and the circuit returns to its initial
state.

Example. 

READ, A, B, C(I, J)‡
PRINT, B, C(I, J)‡

The  appearance  of  the  Symbol  Table  and  Program  Area  should 
be  apparent  from  previous  examples.  Since  this  would  add  little 
to  the  description  of  circuit  action  they  will  be  omitted. 

a = b

The Arithmetic Statement execution unit is energized by any 8-bit
alphabetic character code. This first character of the variable name,
represented above as "a", is discarded. Then either the following
four-digit data address is saved or the data address of a subscripted
variable is computed and saved in a register. After reading and
discarding the = symbol, the circuit executes the expression b
in accordance with the given sequence of arithmetic operator
symbols, +, -, *, /, which are used to control the arithmetic
unit. The partial results at any time during the execution are stored
in the I/O buffer area which is, of course, otherwise unused during
Arithmetic Statement execution. These storage areas for partial
results are called d_i0, d_i1, where i specifies the "level" at which
computation is taking place.¹ i is equal to zero until a left
parenthesis is encountered which increases the current value of i by
1. An exception occurs if the left parenthesis immediately follows
the = symbol. In this case the level remains at zero. It is also
necessary to store control information which relates to these
partial results.

Two control values are required at every level. The count of
left parentheses at any i level is stored as a number, l_i. Before
i is incremented, the incomplete arithmetic operations still
required at the current level are indicated by giving an indicator
t_i the value 1, 2, or 3. Also needed are indicators t+_i and t*_i to
distinguish + from - and * from /. To clarify the significance of
these control values an analysis will be made of the following
expression, which contains some unneeded but legitimate sets of
parentheses:

A = ((B + (C/((D + E*(F)))) + G))

1 The circuit reads and saves the address of A, then reads
and discards the = which puts the circuit at the level i = 0.
The first two left parentheses cause l_0 to be set to 2. The
value of B is stored in d_00. The plus sign followed by a left
parenthesis cause the indicator t_0 to be set to 1 to indicate
the condition "B + (". Since we might in other cases find
"B - (", t+_0 is set to zero to indicate the plus sign.

2 The left parenthesis also causes i to be incremented to one
and since it is the only one at this level, l_1 is also set to
1. The value of C is stored in d_10. The division symbol
followed by a left parenthesis causes t_1 to be set to 2 to
indicate the condition "C/(". Since we might find "C*(" in
other cases, t*_1 is set to 1 to indicate the division.

3 The left parenthesis also causes i to be incremented to 2
and the next left parenthesis increments l_2 to 2. The value
of D is stored in d_20 and the value of E put into d_21,
respectively. The multiplication symbol followed by a left
parenthesis causes t_2 to be set to 3 to indicate the condition
"D + E * (". t+_2 and t*_2 are each set to zero to indicate
the plus and multiplication symbols, respectively.

4 The left parenthesis before the F causes i to be incremented
to 3 and l_3 to be set to 1. The value of F is placed in d_30.

The Arithmetic Statement circuit always puts the final
value computed at any level into the arithmetic unit
register, SR. It does this whenever l_i = 0 for any i. Clearly l_i
must be decremented by one for each right parenthesis.
Therefore the first right parenthesis after the F causes l_3
to equal zero. This condition causes the value stored in
d_30 to be placed in the SR. The value of i is decremented
to 2.

5 t_2 being 3 (and t+_2 = t*_2 = 0) causes the computation,
d_20 + d_21 * SR, to be stored in d_20. The next two
parentheses after F cause l_2 to equal zero. Therefore, this result
is placed in the SR. The value of i is decremented to 1.

6 Since t_1 is equal to 2 and t*_1 is equal to 1 the computation
d_10/SR is made and stored in d_10. The final parenthesis after
the F causes l_1 to equal zero. Therefore this result goes to
SR. i is decremented to zero.

7 Since t_0 is one and t+_0 is zero the computation, d_00 + SR,
is made and the result is stored in d_00.

8 The + G causes the computation d_00 + G to be made and
stored in d_00. The final two parentheses cause l_0 to be zero;
therefore the value in d_00 is placed in SR. (If another right
parenthesis were found, this would cause an error condition
to be indicated.) The ‡ symbol causes the contents of SR
to be stored at the previously saved memory address for A.

¹ Basic circuit operation at any level is described in the earlier report. See
page 363, footnote 2.
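
A loose software analogue of this level-by-level evaluation is sketched below in Python. For each level it keeps a running sum and a current term, which play the roles of the two partial-result locations of that level, together with the pending additive and multiplicative operators, which play the roles of the operator indicators. As a simplification that departs from the circuit, every left parenthesis opens a new level, so no separate count of left parentheses is kept. The function name evaluate, the single-letter variable convention, and the sample values are assumptions made for illustration.

```python
def evaluate(expr, values):
    """Evaluate +, -, *, / and parentheses over single-letter variables."""

    def new_level():
        # d0: running sum, d1: current term, sign/mul: pending operators
        return {"d0": 0.0, "sign": 1, "d1": None, "mul": None}

    def operand(level, v):
        # Combine v with the current term according to the pending * or /.
        if level["mul"] == "*":
            level["d1"] *= v
        elif level["mul"] == "/":
            level["d1"] /= v
        else:
            level["d1"] = v
        level["mul"] = None

    def fold(level):
        # Add or subtract the finished term into the running sum.
        if level["d1"] is not None:
            level["d0"] += level["sign"] * level["d1"]
            level["d1"] = None

    levels = [new_level()]
    for ch in expr.replace(" ", ""):
        top = levels[-1]
        if ch == "(":
            levels.append(new_level())
        elif ch == ")":
            if len(levels) == 1:
                raise ValueError("unmatched right parenthesis")
            fold(top)
            levels.pop()
            operand(levels[-1], top["d0"])   # result passes down, like SR
        elif ch in "+-":
            fold(top)
            top["sign"] = 1 if ch == "+" else -1
        elif ch in "*/":
            top["mul"] = ch
        else:
            operand(top, float(values[ch]))
    if len(levels) != 1:
        raise ValueError("unmatched left parenthesis")
    fold(levels[0])
    return levels[0]["d0"]


if __name__ == "__main__":
    sample = {"B": 1, "C": 12, "D": 2, "E": 2, "F": 2, "G": 3}
    print(evaluate("((B + (C/((D + E*(F)))) + G))", sample))
```

Holding the current term apart from the running sum is what lets a multiplication or division bind more tightly than a pending addition, which is the purpose served by the second partial-result location in the description above.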

Any subscripted variable addresses are computed easily from
the initial DIMENSION statement information, saved in the
Symbol Table, and the current value of the subscripts. Assume the first
data location for an array A(I, J) is stored at a location A_first + 1.
If the DIMENSION statement read DIMENSION A(5, 10) then
the computation, A_first + 5 * (J - 1) + I, gives the correct data
address for any nonzero value of I and J. (This is true only if a
complete data word is stored per memory word; in this machine
the expression is slightly more complicated.)
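
The address computation for a doubly subscripted variable can be written out directly. The Python sketch below assumes one data word per memory word, as the simplified formula above does, so it is not the slightly more complicated expression used by the machine. The name element_address and its parameter names are illustrative.

```python
def element_address(base, first_dimension, i, j):
    """Address of element (i, j), with the first element at base + 1.

    Implements base + first_dimension * (j - 1) + i for subscripts
    that start at one.
    """
    if i < 1 or j < 1:
        raise ValueError("subscripts must be nonzero and positive")
    return base + first_dimension * (j - 1) + i


if __name__ == "__main__":
    # DIMENSION A(5, 10) with an arbitrary base address of 1000
    print(element_address(1000, 5, 1, 1))
    print(element_address(1000, 5, 5, 10))
```

Only the first dimension enters the formula, which is consistent with a single indexing constant being needed per array in this simplified form.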

In this machine the partial result locations d_i0 and d_i1 are
actually 3 words long, of course, to accommodate the data. An
additional word is used to store control information where 4 bits
are used for t_i, t+_i, and t*_i and the remaining 4 bits for the
l_i count. The i counter therefore is actually incremented or
decremented by 7 instead of one. Thus at any level, of which there
can be 14 since the I/O buffer is 100 words long, the l_i count
can be as great as 15. This is more than adequate since it allows
for 210 left parentheses, which is much longer than the I/O buffer
length.
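
The capacity figures quoted above follow from simple arithmetic, which the Python sketch below reproduces. The order in which the fields are packed inside the 8-bit control word is an assumption, as are all of the names.

```python
DATA_WORDS = 3                           # one datum occupies three 8-bit words
WORDS_PER_LEVEL = 2 * DATA_WORDS + 1     # two partial results plus a control word
BUFFER_WORDS = 100                       # length of the I/O buffer
MAX_LEVELS = BUFFER_WORDS // WORDS_PER_LEVEL
MAX_COUNT = (1 << 4) - 1                 # largest value of a 4-bit count


def pack_control(t, t_plus, t_times, count):
    """Pack the indicators (upper 4 bits, assumed order) and the count."""
    if not (0 <= t <= 3 and t_plus in (0, 1) and t_times in (0, 1)
            and 0 <= count <= MAX_COUNT):
        raise ValueError("field out of range")
    return (t << 6) | (t_plus << 5) | (t_times << 4) | count


def unpack_control(word):
    """Inverse of pack_control."""
    return (word >> 6) & 3, (word >> 5) & 1, (word >> 4) & 1, word & 0xF


if __name__ == "__main__":
    print(WORDS_PER_LEVEL, MAX_LEVELS, MAX_LEVELS * MAX_COUNT)
```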

Since  the  appearance  of  the  Symbol  Table  and  Program  Area 
would  add  little  to  this  discussion,  an  example  will  be  omitted. 

Conclusion 

We have illustrated in some detail that a machine for direct
translation of a simple algebraic language is possible. It would therefore
seem that further investigation should be made of the economic position
of this solution vis-a-vis the software compiler solution.
Unfortunately, the present authors are not sufficiently versed in compiler
construction to make such a comparison.

The  actual  construction  of  such  a  machine  as  an  independent 
unit  is  probably  not  reasonable  except  under  particular  circum- 
stances in  which  only  small  one-shot  scientific  problems  form  the 
bulk  of  the  computing.  However,  as  an  adjunct  to  a  larger  general 
purpose  machine,  it  may  well  serve  a  need  as  a  hardware  inter- 
preter for  widely  used  higher  level  languages. 

As a result of a fairly complete design of the control circuits
of this machine, it is estimated that diodes and 100 flip-flops
would be needed for these alone (not including arithmetic circuits).
The design techniques used are simple and straightforward but
rather expensive. These designs should probably only be
considered for use with integrated circuitry.

References 

AndeJ61; BashT64; International Business Machines Corporation, General
Information Manual; FORTRAN, Form F28-807401, December, 1961; IBM
1620 FORTRAN: Preliminary Specifications, Form J29-4200-2, April, 1960

APPENDIX¹

The  variable  match  unit  (VMU)  (Fig.  4) 

The Symbol Table at the end of the load mode should contain
all variable names used by the program, together with empty
locations reserved for data associated with these names. The
Program Area at the end of the load mode should have a program
in which all variable names have been modified in that only the
first letter is retained, followed by the Symbol Table address of
the data associated with this name. Since any variable name may
appear many times in a program, a search is required, during the
loading, to see if the name already exists in the Symbol Table.
The search of the Symbol Table (ST) consists of comparing each
name there with the variable name in the statement being loaded.
All statements are loaded by an appropriate circuit of Fig. 3 from
the I/O buffer and into the Program Area of the memory.
Therefore the variable name in the statement exists physically in the
I/O buffer.

It is the function of the VMU to make this search when
energized or "called" by the loading circuits for DIMENSION, DO,
computed GO TO, READ, PRINT, IF and Arithmetic statements
in which variable names appear. The output action of the VMU
is to set either the OK, AOK or EOL flip-flops. These flip-flops
respectively indicate that the ST either:

1 holds the variable in question as a result of previous loading,
or

2 that the variable is subscripted and has been previously
loaded by the DIMENSION statement loading circuit, or

3 that the End-of-List (EOL) token was found, indicating the
absence of the variable in the ST.

¹ Symbols used in this Appendix are described in Table 8.

The state diagram for this circuit is shown in Fig. 4. When
triggered by the START VMU signal in state 0, the circuit goes
to state 1; the next clock pulse sends it to state 2 from which it
starts its search of the ST. In going from 1 to 2, the I/O Counter
(CIO) contents are saved in register SCIO since the name may
have to be scanned again. The Symbol Table Counter (STC) is
initialized to 4095 since the ST is scanned sequentially downward.

If a character of a variable name in the I/O buffer is found
in the corresponding position of a name in the ST, the character
is said to be matched. The VMU proceeds from state 2 to state
3 if the first character of the name under scan matches. Otherwise
the state changes from 2 to 8, if the NO MATCH signal is given.
The MATCH or NO MATCH signals are generated as a result
of comparing the contents of the ST location undergoing the scan
(the contents reside in the Memory Buffer Register, MBR), with
the contents of the register COMP which has the character from
the I/O buffer. The first character is put into COMP by the calling
circuit; thereafter the VMU picks them up in the 3-4 transition.
The CIO and STC counters are incremented and decremented,
respectively, and the VMU oscillates between states 3 and 4 as
long as matching continues. This comparison process will
terminate when either an arithmetic operator, So, is read from the I/O
buffer sending the circuit to state 6 from state 3, or the ST contents
cause a NO MATCH signal with respect to the contents of the
COMP unit causing the transition from state 4 to 5.

In state 6, if a digit is next read from the ST, corresponding
in position to the appearance of the operator from the I/O buffer,
clearly the names are the same and the OKFF is set to 1, and
the transition from 6 to 0 is made. On the other hand, if another
alphameric character in the ST corresponds to an operator, So,
in the I/O buffer, the names are not the same and the transition
from 6 to 5 is made. In state 5 the circuit just reads to the end
of the nonmatching name in the ST. A digit at the end of this
name causes the transition 5-7 during which the STC is stepped
over the 3 data locations to the next ST entry and the CIO
reinitialized to the start of the name being sought. The first character
in this name is read and placed in COMP as the circuit goes to 2.

Fig. 4. Variable match unit.

As stated earlier, when the first character from the I/O buffer
does not match the contents of ST, the state becomes 8. If the
mismatch was caused by the EOL token in the ST the EOLFF
is set to 1 and state 0 is reached. If the mismatch was due to a
Δ at the present ST location, the STC is decremented by 5 which
steps over the 2 four-digit numbers stored after a Δ and the circuit
returns to 2 to try a match on the next ST entry. If the mismatch
is caused by a digit then this is statement number information
which requires a decrement of STC by 4 to get to the next entry.
If an unmatched alphabetic character in the ST was the reason
for the mismatch, this variable is read to its end in state 12 as
was done in state 5.


Table 8

CIO    Counter for the input output buffer. 4 BCD numeric characters
       (4 bits each), counts up. Can be set to any given number.

CP     Program Counter. (During execution it points to the statement
       to be executed, during loading it points to the location
       where the program is to be loaded.) 4 BCD numeric
       characters, counts up. Can be set to any given value.

COMP   Comparator register, 8 bits. During loading holds a character
       to be matched with some other character in the memory,
       during execution saves the input symbol that drives the
       execution circuits. (Acts as second rank of Memory Buffer
       Register.)

SAR    Save Address register, 4 BCD numerics. Counts down. During
       loading holds the address of the last DO in a nest. During
       execution it is an auxiliary counter.

SAVE   2 BCD (8 bits total) auxiliary register, each bit can be set
       independently of the others.

SCIO   4 BCD numeric register, holds temporarily the value of CIO.

SHR    Special Shift register, 4 BCD characters, can be shifted to
       the left 1 BCD character (4 bits) at a time.

SR     24-bit register, used with the accumulator in the arithmetic
       unit. Bits 1-8, 9-16, 17-24 can be gated independently.

SSAR   Special Save register, 4 BCD numeric (used as auxiliary
       register in loading and execution).

STC    Symbol Table counter, 4 BCD characters, counts down.

Sv     The 8 bits in the MBR are decoded as a single alphabetic
       character (A-Z).

Sd     The 8 bits in the MBR are decoded as a digit (0-9) and bits
       1-4 represent in BCD the value of the digit.

So     The bits in the MBR are decoded as one of the following
       operators, + - * / ( ),

α      8-bit character that precedes a subscripted variable name
       in the Symbol Table.

Δ      8-bit character that precedes the statement number of a
       last statement of a DO nest in the Symbol Table.

λ      An 8-bit character that follows the DO token in the Program
       Area.

d      The 8 bits in the MBR are decoded as 2 BCD digits of 4 bits
       each.

EOL    An 8-bit character that is placed at the current end of the
       Symbol Table.

MATCH  Signal that is generated when the content of the MBR is
       identical to the content of the COMP.

The only other ST symbol which could have caused a mismatch
is an α, the array symbol. This symbol sends the VMU to state
9. If a match is now to occur, it will be with a subscripted variable
name. Thus a match causes a transition from 9 to 13 and states
13 and 14 correspond to states 3 and 4 for a simple variable as
matching proceeds.

Reading an arithmetic operator in the I/O buffer causes
transition to 16 where a corresponding digit in the ST causes the AOKFF
to be set and the circuit returns to 0, during which time it
decrements the STC. This is necessary in order for the STC to hold
the address of the first constant given in the DIMENSION
statement which caused this ST entry. The transition 16 to 15
corresponds to the 6 to 5 transition: the ST name is longer than the
I/O buffer name, and in state 15 the rest of the name is stepped
over. Now, however, the next two words in the ST hold the address
of the next ST entry. Therefore, these are saved and put into the
STC during transition 15-17-7, which otherwise corresponds to the
transition 5-7 for a single variable.

If,  however,  there  was  no  match  in  state  9,  the  circuit  steps 
over  the  rest  of  the  name  in  the  ST  in  state  10  and  initializes  the 
STC  to  the  next  ST  entry  in  the  transition  10-11-12. 

Note that when the VMU returns to its 0 state after setting
either EOL or OK or AOK flip-flops, the STC holds precisely the
address needed for further action. An EOL needs to be replaced,
starting at this STC address, with the new variable name. In the
case of OK or AOK this STC address is the one to be placed in
the program since it holds the data address for simple variables
or the address of the required indexing constant for subscripted
variables.

After the calling circuit has used the VMU it has received one
of the 3 signals from the VMU. For certain statements these signals
can be used to detect syntax errors. If there are none then the
calling circuit takes whatever further action is necessary on the
variable name being scanned.
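
The search performed by the VMU can be approximated in software as a walk down the Symbol Table. The Python sketch below is a simplification under stated assumptions: memory is a dictionary from address to one cell, names consist of capital letters only, data words are two-digit strings, and the individual states of Fig. 4 are not modelled. The sample table follows the layout of Table 6. The names vmu_search, read_name and table6_memory are hypothetical.

```python
ALPHA, DELTA, EOL = "α", "Δ", "EOL"      # array marker, DO marker, end of list


def is_letter(cell):
    return len(cell) == 1 and "A" <= cell <= "Z"


def read_name(mem, stc):
    """Read letters downward from stc; return (name, address after the name)."""
    name = ""
    while is_letter(mem[stc]):
        name += mem[stc]
        stc -= 1
    return name, stc


def vmu_search(mem, sought, top=4095):
    """Return (flag, STC), where flag is 'OK', 'AOK' or 'EOL'."""
    stc = top
    while True:
        cell = mem[stc]
        if cell == EOL:
            return "EOL", stc                 # name is absent from the table
        if cell == DELTA:
            stc -= 5                          # marker plus two four-digit numbers
        elif cell.isdigit():
            stc -= 4                          # statement number and its address
        elif cell == ALPHA:
            name, stc = read_name(mem, stc - 1)
            if name == sought:
                return "AOK", stc - 2         # address of the first constant
            stc = int(mem[stc] + mem[stc - 1])  # stored address of next entry
        else:
            name, stc = read_name(mem, stc)
            if name == sought:
                return "OK", stc              # address of the data
            stc -= 3                          # step over three data words


def table6_memory():
    """Build a Symbol Table laid out like the one in Table 6 (data zeroed)."""
    mem = {}

    def place(addr, *cells):
        for cell in cells:
            mem[addr] = cell
            addr -= 1
        return addr

    place(4095, ALPHA, "B", "34", "88", "00", "20", "04")
    a = place(3488, DELTA, "00", "05", "01", "01")
    a = place(a, "I", "T", "00", "00", "00")
    a = place(a, "L", "00", "00", "00")
    a = place(a, DELTA, "00", "05", "01", "21")
    a = place(a, "I", "00", "00", "00")
    a = place(a, "N", "00", "00", "00")
    a = place(a, "M", "00", "00", "00")
    a = place(a, "00", "05", "01", "36")
    a = place(a, "A", "00", "00", "00")
    place(a, EOL)
    return mem


if __name__ == "__main__":
    memory = table6_memory()
    for name in ("IT", "I", "B", "A", "Z"):
        print(name, vmu_search(memory, name))
```

In this sketch the address returned with OK or AOK is the one that would be placed in the program after the first letter of the name, and the address returned with EOL is where a new name would be entered.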

The  arithmetic  statement  loading  circuit  [Fig.  5) 

An  arithmetic  statement  consists  of  a  string  of  alphameric  symbols, 
S^.Sjj,  grouped  to  form  variable  names,  of  numeric  symbols,  Sj, 
grouped  to  form  constants,  and  of  arithmetic  or  other  operator  svm- 
bols,  Sg,  which  separate  them.  The  Arithmetic  Statement  loading 
circuit  calls  on  the  VMU  circuit  to  find  the  variable  names  as  has 
been  described.  It  then  puts  a  new  name  into  the  ST  lif  required) 


Fig. 5. Arithmetic statement loading.


or  it  puts  the  data  address  into  the  program.  The  8-bit  BCD  forms 
of  the  operator  symbols  are  simply  put  into  the  program.  The 
constants  are  put  into  the  program  after  conversion  to  machine 
form.  The  state  diagram  of  this  circuit  is  shown  in  Fig.  5.  The 
scan  circuit  signal  ARITH  STAT  sends  the  circuit  from  0  through 
1  to  2.  The  scan  circuit  has  saved  the  address  of  the  beginning 
of  this  statement  in  a  register  SCIO.  This  is  used  to  initialize  the 
CIO  so  that  this  statement  can  be  read  from  the  beginning. 

The first symbol of an arithmetic statement, which must be a
variable and not a digit, takes the circuit to state 3 after this
symbol has been put into the program (Sv -> PROG) and the VMU
initialized and started. Any one of the VMU signals is possible
and valid and simply forces the circuit to state 5. During the 3-5
transition the circuit loads the appropriate address into the
program when the name has matched. If it has not matched any
existing name the circuit first goes to state 4 and puts the name
into the Symbol Table before going to state 5. State 5 is that from
which all further loading is accomplished. Variable names are
separated by operators, which are loaded into the program by the
cycle in state 5 (So -> PROG). Note the convention that So
represents any operator symbol not explicitly specified on another exit
from 5. Any variable names cause a transition to state 3 with the
same output action as from state 2. Floating point constants are
loaded via states 5-9-5. A decimal point indicates a floating point
constant and takes the circuit to state 7. (Note that a minus sign
preceding a constant is simply an operator and is processed in state
5.) The SHR is cleared in preparation for the storing of the
following digits in state 7. When E is received the digits of the fraction
in the SHR are left adjusted (ADJUST SHR), if there are fewer than
four of them, and placed in the program area. The exponent sign
is found in the transition 8 to 9. The exponent digit together with
the exponent sign bit is stored in the program area during the
9 to 5 transition. Fixed point constants are handled in state 6. The
important difference is that the digits are not left adjusted in the
SHR and a 04 is put into the program as the exponent since a
decimal point is assumed to precede the first data word. See
Fig. 1.
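
The difference between the two constant formats can be illustrated with a short sketch. It assumes a four-digit fraction word with the decimal point preceding the first digit and a single signed exponent digit, as described above; the function names, the tuple representation, and the reading of "not left adjusted" as right-justified with leading zeros are assumptions made for illustration only, not the circuit's actual storage layout.

```python
from fractions import Fraction

FRACTION_DIGITS = 4  # the text left-adjusts fractions shorter than four digits


def encode_fixed(digits: str) -> tuple[str, int]:
    """Fixed point constant: digits are not left adjusted; exponent is 04."""
    if not digits.isdigit() or len(digits) > FRACTION_DIGITS:
        raise ValueError("expected one to four decimal digits")
    # Assumption: "not left adjusted" means right-justified with leading zeros.
    return digits.rjust(FRACTION_DIGITS, "0"), FRACTION_DIGITS


def encode_floating(fraction: str, exponent: int) -> tuple[str, int]:
    """Floating point constant .fffE(+/-)e: fraction digits are left adjusted."""
    if not fraction.isdigit() or len(fraction) > FRACTION_DIGITS:
        raise ValueError("expected one to four fraction digits")
    if not -9 <= exponent <= 9:
        raise ValueError("exponent is a single signed digit")
    return fraction.ljust(FRACTION_DIGITS, "0"), exponent


def value(word: tuple[str, int]) -> Fraction:
    """Numeric value with the decimal point assumed before the first digit."""
    digits, exponent = word
    return Fraction(int(digits), 10 ** FRACTION_DIGITS) * Fraction(10) ** exponent
```

Under these assumptions the fixed point constant 12 is held as the digits 0012 with exponent 04, that is .0012 x 10^4, which is why no left adjustment is needed in state 6.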

The I takes the circuit to its initial state. If this statement
happens to be the last in a DO nest, the Statement Number Load
circuit has set the LSFF to 1. It has also put the ST address of
the word following the X symbol of the first DO of the nest into
the SSAR register. Since the program counter (CP) now holds the
correct exit address for this DO statement it is placed at the
address given by the SSAR during the transition to state 0. During
the transition the signal START READ is also sent to the paper
tape reader in order for it to put the next Statement into the I/O
buffer.

Hardware implementation of the VMU state diagram

Each function mentioned in the paper plus some other auxiliary
ones are initially represented in a state diagram form, such as the
state diagram for the loading of the Arithmetic Statement
(Fig. 5) and the Variable Match Unit (VMU) (Fig. 4).

We will describe the method used to realize a circuit which
will perform the function defined by a given state diagram (SD).
As an example we will use the VMU. All the information needed
is present on the SD. The operations on the right-hand side of
the "/" in the SD are the output operations required to be
performed. In order to implement these operations we must specify
the actual register gating signals, memory read and write signals,
arithmetic unit signals, etc., required by them. We will call these
various signals the microsteps of an output operation. Therefore
to realize the SD of a given function we must implement the
microsteps corresponding to the output operations.

We begin by listing from the state diagram some output
operations and their corresponding microsteps. For example, in state
2 of Fig. 4, if a MATCH signal is present we are supposed to
increment the CIO counter and then read the I/O buffer.

Consequently the microsteps required are:

↑CIO            This signal causes the CIO to be incremented by one.
CIO -> MAR      This signal causes the CIO to be gated to the
                memory address register.
READ            This signal initiates a memory read cycle.
CHANGE STATE    This signal causes the VMU to go from state
                2 to state 3.

Therefore the execution of the above microsteps, in that order,
would implement the 2-3 transition of Fig. 4. Some microsteps
for the VMU are listed at the end of this Appendix. The largest
number of microsteps for a transition from one state to another
is 8, which occurs in the transition from state 8 to state 2. Once
this maximum number of microsteps is determined, a control cycle
counter is constructed, which can count as high as this maximum.
Since in this case the number is 8 we need 3 flip-flops to realize
it. In addition, a "one hot line" decoder is needed such that at
each count one and only one line of the decoder has a "one" at
its output. Also needed is a state diagram counter which realizes
the "skeleton" of the state diagram. This skeletal counter tells us
which state we are in and which to change to, given the present
input signal or symbol. Thus the skeletal counter "knows" that
if the circuit is in state 2 and a MATCH signal is present, it should
change to state 3 upon receipt of a change state signal. The
realization of such a skeletal counter has been described [Bashkow,
1964]. Now we use the outputs of the skeletal counter which will
indicate to us the state we are in, the outputs of the decoder of
the control cycle counter, and the input lines (Sv, Sd, MATCH,
NO MATCH) and connect them as shown in Fig. 6. Each AND
gate in this figure has 3 inputs except those not requiring input
line information. One input comes from the input set (Sv, Sd,
MATCH, etc.). The second input comes from the state diagram
skeletal counter which indicates a unique state of the state
diagram, and finally the third comes from the control cycle counter.
The output of each AND gate is a line indicating a unique
microstep. The ANDs feed OR gates, which actually energize the given
microstep. For example the output lead of the "READ" OR gate
is connected to the "READ" terminal of the memory.
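
The gating scheme just described can be mimicked in software. The following is a minimal sketch, not the hardware itself: it models only the 2-3 transition whose microsteps were listed earlier, and the names MICROSTEPS, SKELETON, one_hot and run_transition are hypothetical. Each pass of the inner loop plays the role of one AND gate, which fires only when the state, the input line, and one line of the control cycle decoder coincide.

```python
from math import ceil, log2

# Ordered microsteps for one transition, as listed in the text (state 2, MATCH).
MICROSTEPS = {
    (2, "MATCH"): ["INCREASE CIO", "CIO -> MAR", "READ", "CHANGE STATE"],
}
# Skeleton of the state diagram: the state entered on CHANGE STATE.
SKELETON = {(2, "MATCH"): 3}


def flip_flops_needed(max_microsteps: int) -> int:
    """Flip-flops for a control cycle counter with max_microsteps counts."""
    return max(1, ceil(log2(max_microsteps)))


def one_hot(count: int, width: int) -> list[int]:
    """Decoder output: exactly one line carries a one at each count."""
    return [1 if line == count else 0 for line in range(width)]


def run_transition(state: int, input_line: str, width: int = 8):
    """Step the control cycle counter and collect the microsteps that fire."""
    steps = MICROSTEPS[(state, input_line)]
    fired = []
    for count in range(width):
        lines = one_hot(count, width)
        for position, microstep in enumerate(steps):
            # AND gate: state AND input line AND control cycle line.
            if lines[position]:
                fired.append(microstep)  # the OR gate energizes the microstep
                if microstep == "CHANGE STATE":
                    return SKELETON[(state, input_line)], fired
    return state, fired
```

Calling run_transition(2, "MATCH") returns the next state together with the four microsteps in the order in which the decoder lines enable them, and flip_flops_needed(8) gives the three flip-flops mentioned in the text.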

If  we  assume  that  the  control  cycle  counts  in  sequence  1,  2, 
etc.,  then  the  lead  numbered  1  will  go  to  the  first  microstep  of 
each sequence. The one numbered 2 will go to the second, etc.
Therefore  we  see  that  the  following  microsteps  should  be  executed 
in  the  order  listed  below  for  states  0,  1,  2,  5  of  Fig.  4.  The  circuit 
which  causes  the  execution  is  shown  in  Fig.  6. 

State 0    and START VMU
           CHANGE STATE
State 1    CIO -> SCIO
           0100 0000 1001 0101 -> STC
           STC -> MAR
           READ
           CHANGE STATE
State 2    and MATCH
           INCREASE CIO
           CIO -> MAR
           READ


Fig. 6. State diagram implementation.


State 2    and NO MATCH
           CHANGE STATE
State 5    and Sd
           STC -> MAR
           READ
           CHANGE STATE
State 5    and Sv
           DECREASE STC
           DECREASE STC
           DECREASE STC
           SCIO -> CIO
           READ
           CHANGE STATE

In state 0 of Fig. 4 a START VMU signal takes it to state 1. This
is accomplished by the top AND of Fig. 6. The only microstep
needed is CHANGE STATE. In state 1 of Fig. 4, the next clock
pulse after reaching state 1 causes a transition to state 2. In this
case we need to save the CIO in register SCIO (CIO -> SCIO),
set the STC to 4095 (4095 -> STC, shown above in BCD form) and
get the contents of the address now in the Symbol Table Counter
(READ(STC)). This latter is implemented by the two microsteps
STC -> MAR followed by a READ command to the core memory.
This transition from 1 to 2 of Fig. 4 is accomplished by the next
5 AND gates shown in Fig. 6. The next AND gates shown
accomplish the transition from state 2 to 3 if there is a MATCH. The
next AND accomplishes the transition from 2 to 8 if there is NO
MATCH (in this case nothing need be done). Finally the lowest
two groups of AND gates implement the required microsteps as
the circuit changes from state 5 to 7 if a 4-bit digit code is sensed
or causes the circuit to remain in state 5 after decrementing the
STC if an 8-bit variable code is read.


Chapter  32 


A  microprogrammed  implementation 
of EULER on IBM System/360 Model 30¹


Helmut  Weber 


Summary An experimental processing system for the algorithmic language
EULER has been implemented in microprogramming on an IBM System/360
Model 30 using a second Read-Only Storage unit. The system consists of a
microprogrammed compiler and a microprogrammed String Language
Interpreter, and of an I/O control program written in 360 machine language.

The system is described and results are given in terms of microprogram
and main storage space required and compiler and interpreter performance
obtained. The role of microprogramming is stressed, which opens a new
dimension in the processing of interpretive code. The structure and content
of a higher level language can be matched by an appropriate interpretive
language which can be executed efficiently by microprograms on existing
computer hardware.


Introduction 

Programs written in a procedure-oriented language are usually
processed in two steps. They are first translated into an equivalent
form which is more efficiently interpretable; then the translated
text is interpreted ("executed") by an interpretation mechanism.
The translation process is a data-invariant and flow-invariant
operation. It consists of two parts: an analytical part, which
analyzes the higher level language text, and a generative part,
which builds up a string of instructions that can be directly
interpreted by a machine. The analytical part of the translator depends
on the higher level language; the generative part depends on a
set of instructions interpretable by a machine. Historically there
was only one set of instructions which could be interpreted
efficiently by a machine, its "machine language." Figure 1 outlines
this scheme.

Some of the processors of the IBM System/360 family are
microprogrammed machines. On them the "360 machine
language" is interpreted not by wired-in logic but by an interpretive
microprogram, stored in control storage, which in turn is
interpreted by wired-in logic. Therefore, in a certain sense the 360
language is not the "machine language" of these processors but
the (efficiently interpretable) language in which the processors of
the System/360 family are compatible. The true "machine
language" of these processors is their microprogram language. This
language is on a lower level than the "360 language"; it contains
the elementary operations of the machine as operators and the
elements of the data flow and storage as operands.

¹Comm. ACM, vol. 10, no. 9, pp. 549-558, September, 1967.

Now it is conceivable to compile a program written in a higher
level language into a microprogram language string. This string
would undoubtedly contain substrings which occur over and over
in the same sequence. We could call these substrings procedures
and move them out of the main string, replacing their occurrence
by a procedure call symbol, followed by a parameter designator
pointing to the particular procedure. Our object program then
takes on the appearance of a sequence of call statements. From
here it is only a final step to eliminate the call symbols and furnish
an interpreting mechanism which interprets the remaining
sequence of "procedure designators."

The process just described will result in the definition of a string
language and the development of a microprogrammed
interpretation system to interpret texts in this string language. The situation
is similar to the System/360 case: the string language corresponds
to the 360 language. Programs written in a higher level language
are compiled into string language text to be stored in main storage.
The string language interpreter corresponds to the microprogram


Fig. 1. Processing programs written in higher level languages via
translation to machine language.


which  interprets  360  language  texts.  It  consists  of  a  recognizing 
part  to  read  the  next  consecutive  string  element  and  to  branch 
to  an  appropriate  action  routine  and  of  action  routines  to  execute 
the  particular  procedure  called  for  by  the  string  element. 
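
The division into a recognizing part and action routines can be pictured with a small sketch. The operator codes below are hypothetical one-byte values chosen only for this illustration, and the three-byte big-endian operand after OP_NUMBER is likewise an assumption; the sketch shows the dispatch structure, not the actual EULER string language or its microprogrammed interpreter.

```python
# Hypothetical one-byte operator codes, chosen only for this sketch.
OP_NUMBER, OP_ADD, OP_MUL, OP_END = 0x01, 0x02, 0x03, 0xFF


def interpret(program: bytes) -> list[int]:
    """Run a reverse Polish byte string and return the final stack."""
    stack: list[int] = []
    pc = 0  # index of the next string element

    def number() -> None:
        nonlocal pc
        # Assumed operand format: three trailing bytes, most significant first.
        stack.append(int.from_bytes(program[pc:pc + 3], "big"))
        pc += 3

    def add() -> None:
        right, left = stack.pop(), stack.pop()
        stack.append(left + right)

    def mul() -> None:
        right, left = stack.pop(), stack.pop()
        stack.append(left * right)

    actions = {OP_NUMBER: number, OP_ADD: add, OP_MUL: mul}

    while pc < len(program):
        op = program[pc]  # recognizing part: read the next string element
        pc += 1
        if op == OP_END:
            break
        actions[op]()     # branch to the action routine for this element
    return stack
```

A string for (2 + 3) * 4 in reverse Polish order is the byte sequence number 2, number 3, add, number 4, multiply, end; the loop never compiles anything further, it only recognizes and dispatches.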

The essential difference between our situation and the 360 case
is  that  the  string  language  reflects  the  features  of  the  particular 
higher  level  language  as  well  as  the  features  of  the  particular 
hardware  better  than  the  general  purpose  360  language. 

What is gained by defining this string language and by
providing a microprogrammed interpreter for it? From the method of
definition described, it can be seen that the elements of the string
language correspond directly to the elements of the higher level
language after all simplifying data-invariant and flow-invariant
transformations have been performed. But the elements of the
string language are also well-adapted to the microprogram
structure of the machine. Therefore, during the compiling process (see
Fig. 2) only a minimum of generation is necessary to produce the
string language text. The compiler is shorter and runs faster.

But the more important aspect is that object code execution
is also faster. The string language interpreter in case 2 will be
coded to take care of all necessary operations in a concise form,
whereas in case 1 it will be necessary to compile a whole sequence
of machine language instructions for an elementary operation in
the higher level language. Examples of this are the compilation
of 360 code for an add operation in COBOL of two numbers with
different scaling factors or the compilation of machine instructions
for table lookup or search operations, etc. In these cases the string
language interpreter of Fig. 2 will execute a function much faster
than the machine language interpreter of Fig. 1 will execute the
equivalent sequence of machine language instructions. Therefore,
object code execution will be faster in scheme 2.

If object code performance is not as much in demand as object
storage space economy, the string language interpreter can also
be written such that the string language is as tightly packed as

Fig. 2. Processing programs written in higher level languages via
translation to interpretive language.


possible so that the translated program is as compact as possible
and will take up less storage space than the equivalent machine
language program under the scheme of Fig. 1.

These ideas are applied in an experimental microprogram
system for the higher level language EULER [Wirth and Weber,
1966a and 1966b] described below. Problem areas in this approach
are indicated and some ideas for future development are offered.

Special  considerations  for  EULER 

The higher level language EULER [Wirth and Weber, 1966a and
1966b] is a dynamic language. This means that for programs
written in it many things have to be done at object code execution
time which can be done at compile time for other languages.
EULER also contains basic functions which do not have
comparable basic counterparts in the machine languages of most machines.
To compile machine code for these dynamic properties and for
those special functions would require rather lengthy sequences of
machine language instructions, which would consume considerable
object code space and require high object code execution time.
Therefore, for a language like EULER, interpretation at the string
language level by an interpreter into which the dynamic features
and special functions are included by microcode will yield much
higher object code economy and object code performance than
compilation to machine language and interpretation of this
machine language.

Three  examples  from  EULER  are  given  here. 

1. Dynamic type handling. To a variable in EULER, constants of
varying type can be assigned dynamically. For example in

A <- 3; ... ; A <- 4.5₁₀-5; ... ; A <- true; ... ; A <- '...'; ...

the quantities assigned to the variable A have the types: integer,
real, logical, procedure. Therefore, in EULER each quantity has
to carry its type indicator along and each operator operating on
a variable has to perform a dynamic type test. The adding operator
+ for instance in A + B has to test dynamically whether both
operands are of type number (integer or real). This type testing
is done by the String Language Interpreter in minimum time,
whereas it would require extra instructions if the program were
to be compiled to 360 machine language.
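
A dynamic type test of this kind can be sketched as follows. The Value class, the type names as strings, and the rule that integer plus integer stays integer are assumptions made for illustration; the sketch shows only that every quantity carries its type indicator and that the adding operator checks it at execution time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A quantity that carries its type indicator along with its payload."""
    type: str        # e.g. "integer", "real", "logical"
    payload: object


NUMBER_TYPES = {"integer", "real"}


def add(a: Value, b: Value) -> Value:
    """The + operator: test dynamically that both operands are numbers."""
    if a.type not in NUMBER_TYPES or b.type not in NUMBER_TYPES:
        raise TypeError(f"cannot add {a.type} and {b.type}")
    # Assumed result rule: integer only when both operands are integer.
    result_type = "integer" if a.type == b.type == "integer" else "real"
    return Value(result_type, a.payload + b.payload)
```

For example, add(Value("integer", 3), Value("integer", 4)) yields an integer 7, while adding a logical value raises an error at execution time rather than at compile time.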

2. Recursive procedures and dynamic storage allocation. In
EULER, procedures can be called recursively, e.g.,

F <- 'formal N; if N = 0 then 1 else N * F(N - 1)';

and storage is allocated dynamically, e.g.,

new N; ... ; N <- 4; ... ; begin new A; A <- list N;

In order to cope with these problems the EULER execution system
uses a run time stack. Each operation is accompanied by stack
pointer manipulations which by the microprogram can be
accomplished in minimum time (in general, even without extra time
because they are overlapped with the operation proper), whereas
extra instructions would be required if the program were
compiled.

3. List processing. EULER includes a list processing system, and
lists are of a general tree structure, e.g.,

A <- (3, 4, (5, 6, 7), true, ...);

List operators are provided like tail and cat and subscripting:

B <- A[3]; C <- B cat A; C <- tail C;

The string language interpreter handles list operations directly and
efficiently by special microprograms. If the program were
compiled to 360 machine language, a sequence of instructions
would be required for each list operation.
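
The three list operations can be imitated with ordinary Python lists. This is a sketch under stated assumptions: subscripts are taken to start at 1, tail is taken to drop the first element, and cat is taken to concatenate two lists; the function names are hypothetical and nothing here reflects how the microprograms store lists.

```python
def subscript(lst: list, index: int):
    """Element selection with subscripts assumed to start at 1."""
    if not 1 <= index <= len(lst):
        raise IndexError("subscript out of range")
    return lst[index - 1]


def tail(lst: list) -> list:
    """The list without its first element (assumed meaning of tail)."""
    if not lst:
        raise ValueError("tail of an empty list")
    return lst[1:]


def cat(left: list, right: list) -> list:
    """Concatenation of two lists (assumed meaning of cat)."""
    return left + right


# A list of general tree structure: the third element is itself a list.
A = [3, 4, [5, 6, 7], True]
B = subscript(A, 3)
C = tail(cat(B, A))
```

With these definitions B is the sublist of three elements and C is the concatenation of B and A with its first element removed.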

EULER  system  on  IBM  System/360  Model  30 

An experimental processing system for the EULER language has
been written to demonstrate the validity of these ideas. It is a
system running under the IBM Basic Operating System and
consists of three parts:

1 A translator, written in Model 30 microcode.¹ This
translator is a one-pass syntax-driven compiler which translates
EULER source language programs into a reverse Polish
string form.

2 An interpreter, written in Model 30 microcode,¹ which
interprets string language programs.

3 An I/O Control Program written in 360 machine language.²
This IOCP links the translator and interpreter to the
operating system and handles all I/O requests of the translator
and interpreter.

¹Stored in the second Read-Only Storage (Compatibility ROS) of Model
30.

²The 360 microprograms are stored in the first Read-Only Storage (360
ROS) of the Model 30.


The system is an experimental system. Not all the features of
EULER are included, only the general principles that are to be
demonstrated. The restrictions are:

1 Real numbers are not included; only integers are
recognized.

2 The interpreter microprograms for the operators Divide,
Integer Divide, Remainder, and Exponentiation have not
been coded.

3 The type 'symbol' is not included.

4 No garbage collector is provided. Therefore, the system
comes to an error stop if a list processing program has used
up all available storage space (32K bytes).

Also for reasons of simplicity, the system is written only for
a 64K System/360 Model 30 and the storage areas for tables,
compiled programs, stacks and free space are assigned fixed
addresses.

The string language into which source programs are translated
is defined as closely as possible to the interpretive language used
in the definition of EULER [Wirth and Weber, 1966a and 1966b].
The question whether this is the ideal directly interpretable
language corresponding to the EULER source language given the
Model 30 hardware is left open. Also no attempt is made to define
the string language so that it becomes relocatable for use in time
sharing or conversational processing mode.

The  three  storage  areas  used  by  the  execution  system  are: 

1  Program  area 

2  Stack 

3  Variable  area 

Program area. A translated program in string language consists of
a sequence of one-byte symbols for the operators (+, -, begin,
end, <-, go to, etc.). Some of the symbols have trailer bytes
associated with them; for instance, the symbol +number has three
trailer bytes for a 24-bit absolute value of the integer constant.

| +number |      value      |

The symbol reference (@) has two trailer bytes, one containing
the block number (bn), the second one the ordinal number (on).

| @ | bn | on |




The operators then, else, and, or and ' have two trailer bytes
containing a 16-bit absolute program address, e.g.,

| then | pa |

Other operators with trailer bytes are label and the list-building
operator.
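
The trailer-byte layout of the program area can be sketched as a small encoder. The numeric symbol codes below are hypothetical, since the actual one-byte values are not given here, and the most-significant-byte-first order of multi-byte fields is an assumption; only the trailer lengths (three bytes for a number, two for a reference, two for a program address) come from the description above.

```python
# Hypothetical one-byte symbol codes, for illustration only.
SYM_NUMBER, SYM_REFERENCE, SYM_THEN = 0x10, 0x11, 0x12


def emit_number(value: int) -> bytes:
    """Symbol followed by three trailer bytes holding a 24-bit absolute value."""
    if not 0 <= value < 1 << 24:
        raise ValueError("absolute value must fit in 24 bits")
    return bytes([SYM_NUMBER]) + value.to_bytes(3, "big")


def emit_reference(bn: int, on: int) -> bytes:
    """Reference symbol followed by block number and ordinal number."""
    return bytes([SYM_REFERENCE, bn, on])  # bytes() rejects values over 255


def emit_with_address(symbol: int, pa: int) -> bytes:
    """Operator followed by two trailer bytes holding a 16-bit program address."""
    if not 0 <= pa < 1 << 16:
        raise ValueError("program address must fit in 16 bits")
    return bytes([symbol]) + pa.to_bytes(2, "big")
```

A string element therefore occupies one, three or four bytes depending on the symbol, which is what allows the recognizing part of the interpreter to step through the program area byte by byte.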


Stack. The execution time stack consists of a sequence of 32-bit
words. It contains block and procedure marks to control the
processing of blocks and procedures and temporary values of the
various types. The first 4-bit digit of a word in stack always is
a type indicator. The format of these words is given in Fig. 3.


Variable area. The variable area is an area (32K bytes long) of
32-bit words used for the storage of values assigned to variables
and lists (and also for auxiliary words in procedure descriptors;
see type procedure in Fig. 3). The format of the entries is exactly
the same as the format of the stack entries (see Fig. 3), the only
exception being that a mark can never occur in the variable area.

Microprogramming the IBM System/360 Model 30
[Fagg et al., 1964]

Microprograms are sequences of microprogram words. A
microprogram word is composed of 60 bits and contains various fields
which control the basic functions in the IBM System/360 Model
30 CPU. These basic functions are storage control, control of the

Type undefined

Type integer
    sign:   + 0
            - 1
    value:  value in hexadecimal (< 16^6)

Type logical
    value:  true  1
            false 0

Type label        | 4 | mp | pa |
    mp:   mark pointer, points to the stack location of the mark for
          the block in which the label is defined.
    pa:   16-bit absolute program address.

Type reference    | 5 | mp | loc |
    mp:   mark pointer, points to the stack location of the mark for
          the block in which the variable is defined.
    loc:  location of word in variable area which contains value
          assigned to variable.

Type procedure    | 6 | mp | link |
                  | 6 | bn | pa   |    (word in variable area addressed by link)
    mp:   mark pointer, points to the stack location of the mark for
          the block (or procedure) in which the procedure is defined.
    link: pointer to a word in variable area which contains
          additional information.
    bn:   block number of the block (or procedure) in which the
          procedure is defined.
    pa:   16-bit program address, where string code for procedure
          starts.

Type list         | 7 | length | loc |
    length: number of elements in list (< 16^3).
    loc:    16-bit location of first list element in variable area (lists
            are stored in consecutive storage locations).

Mark
    word 1:  static link | bn
    word 2:  dynamic link | return address
    word 3:  7 | length | loc

A mark consists of 3 words in stack; it is built each time a block or
a procedure is entered.
    static link:    static link to mark of embracing block.
    bn:             block number.
    dynamic link:   dynamic link to mark of embracing block (or
                    procedure).
    return address: 16-bit program address to which to return
                    upon normal exit of procedure (for procedure marks
                    only; this field is 0 for block marks).
The last stack word in a mark is a list descriptor (see type list)
for the variable list (in a block mark) or the actual parameter list
(in a procedure mark).

Fig. 3. Format of words in stack and variable area.
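
How such a 32-bit word might be packed and unpacked can be shown with a few shifts and masks. The sketch assumes a layout of a 4-bit type indicator, a 12-bit middle field and a 16-bit low field, which is inferred from the 32-bit word length, the leading 4-bit type digit and the 16-bit addresses; the exact bit positions and the names TYPE_LABEL, TYPE_LIST, pack and unpack are assumptions, not taken from the figure.

```python
TYPE_LABEL = 4  # label word: type, mark pointer, program address
TYPE_LIST = 7   # list word: type, length, location


def pack(type_code: int, middle: int, low: int) -> int:
    """Pack a word as 4-bit type, 12-bit middle field, 16-bit low field."""
    if not (0 <= type_code < 16 and 0 <= middle < 1 << 12 and 0 <= low < 1 << 16):
        raise ValueError("field out of range")
    return (type_code << 28) | (middle << 16) | low


def unpack(word: int) -> tuple[int, int, int]:
    """Split a 32-bit word back into (type, middle field, low field)."""
    return word >> 28, (word >> 16) & 0xFFF, word & 0xFFFF


# A list descriptor for a three-element list starting at location 0x8000.
descriptor = pack(TYPE_LIST, 3, 0x8000)
```

Because the type indicator always occupies the leading digit, an interpreter routine can test the type of any stack or variable-area word with a single shift before looking at the remaining fields.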


Fig. 4. Simplified data flow of the IBM System/360 Model 30.

data flow registers and the Arithmetic-Logic-Unit (ALU),
microprogram sequencing and branching control, and status bit-setting
control. Microprogram words are stored in a Card Capacitor
Read-Only Storage (CCROS). Fetching one microprogram word
and executing it takes 750 nsec, the basic machine cycle.

Figure 4 shows in simplified form the data flow of the IBM
System/360 (IBM 2030 CPU). It consists of a core storage with
up to 65,536 8-bit bytes and a local storage (accessible by the
microprogrammer but not explicitly by the 360 language
programmer), a 16-bit storage address register (M, N), a set of 10 8-bit
data registers (I, J, ..., R), an arithmetic-logic-unit (ALU),
connecting 8-bit wide buses (Z, A, B, M, N-bus), temporary registers
(A, B), switches and gates.
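
Since every bus and data register is 8 bits wide, a 16-bit storage address has to be assembled from two bytes. The sketch below assumes that M holds the high-order byte and N the low-order byte of the address; that assignment, and the function names, are assumptions made for illustration.

```python
CORE_BYTES = 65_536  # maximum core storage size given in the text


def storage_address(m: int, n: int) -> int:
    """Combine two 8-bit registers into one 16-bit address (M assumed high)."""
    if not (0 <= m < 256 and 0 <= n < 256):
        raise ValueError("registers are 8 bits wide")
    return (m << 8) | n


def split_address(address: int) -> tuple[int, int]:
    """Split a 16-bit address into its (M, N) register bytes."""
    if not 0 <= address < CORE_BYTES:
        raise ValueError("address exceeds 16 bits")
    return address >> 8, address & 0xFF
```

Two 8-bit registers give exactly 2^16 = 65,536 distinct addresses, which matches the maximum core storage size quoted above.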

Figure 5 shows the more important fields of a microprogram
word. Only 47 bits are shown. Other fields contain various parity
bits and special control bits. The field interpretation given in Fig.
5 is as for microprogram words in the second Read-Only Storage
unit (Compatibility ROS) if the machine is equipped with the 1620
Compatibility Feature. The meaning of the microprogram word
fields is explained in connection with Fig. 6 which shows the
symbolic representation of a microprogram word together with
an example as it appears on a microprogram documentation sheet.

The  fields  of  the  microprogram  word  can  be  grouped  in  five 
categories: 

1  ALU  control  fields:  CA,  CF,  CB,  CG,  CV,  CD,  CC 

2  Storage  control  fields:  CM,  CU 

3  Microprogram  sequencing  and  branching  fields:  CN,  CH, 
CL 

4  Status  bit  setting  field:  CS 

5  Constant  field:  CK 


Chapter  32     A  microprogrammed  implementation  of  EULER  on  IBM  System/360  Model  30  387 


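[Fig. 5 table. Columns: CN, CH, CL, CM, CU, CA, CB, CK, CD, CF, CG, CV,
CC, CS; rows: codes 0000 through 1111. The entries legible in the source
are listed below by field, with codes in hexadecimal; entries marked *
are not listed.]

CH:  0 = 0; 1 = 1; 2 = R0; 6 = ALU carry; 7 = S0; 8 = R2; 9 = S2;
     A = S4; B = S6; C = G0; D = G2; E = G4; F = G6
CL:  0 = 0; 1 = 1; 4 = G1; 5 = R valid decimal; 6 = R1; 8 = G7; 9 = S3;
     A = S5; B = S7; C = R3; D = G3; E = G5; F = Interrupt
CM:  0 = Write; 1 = No access; 2 = Store; 3 = IJ→MN; 4 = UV→MN;
     5 = LT→MN
CU:  0 = MS; 1 = LS
CA:  4 = S; 7 = R; 8 = D; 9 = L; A = G; B = T; C = V; D = U; E = J; F = I
CB:  0 = R; 1 = L; 2 = D; 3 = K
CK:  0 through 9, X'A'* through X'F'
CD:  0 = Z; 6 = S; 7 = R; 8 = D; 9 = L; A = G; B = T; C = V; D = U;
     E = J; F = I
CF:  0 = 0; 1 = L; 2 = H; 3 = Through; 5 = XL; 6 = XH; 7 = X
CG:  0 = 0; 1 = L; 2 = H; 3 = Through
CV:  0 = +
CC:  0 = +0; 1 = +1; 2 = And; 3 = Or; 4 = +0,Save C; 5 = +1,Save C;
     6 = +C,Save C; 7 = XOR
CS:  0 = No setting; 1 = LZ→S5; 2 = HZ→S4; 3 = HZ→S4, LZ→S5;
     4 = 0→S4, 0→S3; 5 = 1→S1; 6 = 0→S0; 7 = 1→S0; 8 = 0→S2;
     9 = ANSNZ→S2; A = 0→S6; B = 1→S6; C = 0→S7; D = 1→S7; F = 0→S1

*X'A' means hexadecimal digit A = 1010

Fig. 5. IBM System/360 Model 30 microprogram word. (Detailed explanation is provided in text.) The field
interpretation is given for microprogram words in compatibility ROS if the machine is equipped with the 1620
compatibility feature. Fields marked "*" contain designators not explained here in order not to confuse the basic principles.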


ALU control fields. On the line designated "ALU" in Fig. 6, an
ALU statement can appear. It will specify an A-source and a
B-source, possibly an A-source modifier and a B-source modifier,
an operator, a destination, and possibly a carry-in control and a
carry-out control.

CA is the A-source field. It controls which one of the 10 8-bit
data registers is connected to the transient A-register and therefore
to the A-input of the ALU.

CB is the B-source field. It controls whether the R, L, or
D-register or the CK-field is connected to the transient B-register
and therefore to the B-input of the ALU. If "K" (CB = 3) is specified
in this field, the 4-bit constant field CK is doubled up; i.e., the
same four bits are used as the high digit and the low digit.
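
The doubling of the constant can be sketched in a few lines of Python. This is only an illustration of the rule just stated; the function name `double_constant` is hypothetical and does not come from the Model 30 documentation.

```python
def double_constant(ck: int) -> int:
    """Build the 8-bit B-input selected by "K" from the 4-bit CK field.

    The same four bits serve as both the high digit and the low digit.
    """
    if not 0 <= ck <= 0xF:
        raise ValueError("CK is a 4-bit field")
    return (ck << 4) | ck


if __name__ == "__main__":
    print(hex(double_constant(0xD)))  # 0xdd
```

With CK = 1101 the B-register therefore receives X'DD'; the high/low gate described next can then admit only one of the two digits.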

Between the A-register and the ALU input is a straight/cross
switch and a high/low gate. Their function is controlled by the
CF-field. Depending on the value of this field, no input is gated
into the ALU (0) or only the low (L) or high digit (H) is admitted.
CF = 3 gates all eight bits straight through, whereas the codes
CF = 5, 6, and 7 cross over the two digits of the byte before
admitting the low (XL) or high digit (XH) or both digits (X).
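
A minimal sketch of the A-input gating follows, assuming that L and H carry the codes 1 and 2 (the text fixes only 0, 3, 5, 6, and 7) and that "crossing" means exchanging the two 4-bit digits before the high/low gate is applied. The function name `a_input_gate` is hypothetical.

```python
def a_input_gate(byte: int, cf: int) -> int:
    """Return the byte admitted to the ALU A-input for a given CF code."""
    byte &= 0xFF
    # Cross switch: exchange the high and low digits of the byte.
    crossed = ((byte << 4) | (byte >> 4)) & 0xFF
    if cf == 0:
        return 0                # no input gated into the ALU
    if cf == 1:
        return byte & 0x0F      # L: low digit only (code assumed)
    if cf == 2:
        return byte & 0xF0      # H: high digit only (code assumed)
    if cf == 3:
        return byte             # all eight bits straight through
    if cf == 5:
        return crossed & 0x0F   # XL: low digit after crossing
    if cf == 6:
        return crossed & 0xF0   # XH: high digit after crossing
    if cf == 7:
        return crossed          # X: both digits, crossed
    raise ValueError("CF code not covered by this sketch")


if __name__ == "__main__":
    for code in (0, 1, 2, 3, 5, 6, 7):
        print(code, hex(a_input_gate(0xA7, code)))
```

Under these assumptions the byte X'A7' is admitted as X'7A' for CF = 7 and as X'0A' for CF = 5.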

Between the B-register and the ALU input is a high/low gate
and a true/complement control. The high/low gate is controlled
by the CG-field in the same manner as the high/low gate in the
A-input. The true/complement control is operated by the CV-field.
It admits the true byte to the ALU or the inverted byte (−)
or controls a six-correct mechanism for decimal addition (@).

[Fig. 6 diagram. Format of symbolic representation, with the labels
X6X7, CONSTANT, ALU, STORAGE, STATUS SETTING, BRANCHING, COORD, and
SEQUENCE. Example: X6X7 = 01; constant 1101; ALU R+KH→D; storage
WRITE; status setting HZ→S4, LZ→S5; branching G4,G5; coordinate C4.]

Fig. 6. Symbolic representation of a System/360 Model 30 microprogram word.

The operator and carry controls are given by the CC-field. This
field specifies binary addition without carry handling (+0), addition
with injection of a 1 (+1) (for instance, to simulate subtraction
in connection with the B-input inverter), addition with saving the
carry in bit 3 of register S (+0,Save C and +1,Save C), and
addition using an old carry stored in bit 3 of register S and saving
the new carry in this same bit (+C,Save C). Other codes specify
logical operations (AND, OR, XOR).
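
The carry options of the CC-field can be illustrated with a small Python sketch. It models only the arithmetic just described: an 8-bit add with a carry input of 0, 1, or a saved carry, subtraction by inverting the B-input and injecting a 1, and a two-byte addition that passes the saved carry from the low byte to the high byte. The function names are hypothetical.

```python
def alu_add(a: int, b: int, carry_in: int = 0):
    """8-bit binary addition; returns (result byte, carry out)."""
    total = (a & 0xFF) + (b & 0xFF) + (carry_in & 1)
    return total & 0xFF, total >> 8


def subtract(a: int, b: int):
    """a - b simulated as a + (inverted b) + 1, as with CV = '-' and CC = '+1'."""
    return alu_add(a, b ^ 0xFF, 1)


def add16(x: int, y: int) -> int:
    """Two-byte addition: '+0,Save C' on the low bytes, '+C,Save C' on the high."""
    low, saved_carry = alu_add(x & 0xFF, y & 0xFF, 0)
    high, _ = alu_add(x >> 8, y >> 8, saved_carry)
    return (high << 8) | low


if __name__ == "__main__":
    print(subtract(0x35, 0x12))     # (35, 1), i.e. X'23' with a carry out
    print(hex(add16(0x01FF, 0x0001)))  # 0x200
```

The same two-step pattern appears later in the chapter, where the instruction counter is updated by adding 1 to J and then adding the saved carry into I.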

The  CD-field  specifies  into  which  register  the  result  of  the  ALU 
operation  is  gated.  Any  one  of  the  10  data  registers  can  be  speci- 
fied. Z  means  that  the  ALU  output  is  gated  nowhere  and  will  be 
lost. 

Storage control fields. On the line designated "storage" in Figure
6, a storage statement can appear. It will specify whether this
microcycle is a read cycle, a write cycle, a store cycle or a
no-storage access cycle, and from where the storage address is
supplied (CM-field) and whether storage access is to main storage
or local storage (CU-field). Note that a full storage cycle (1.5 μsec)
corresponds to two read-only storage cycles (750 nsec each).

The  codes  CM  =  3,  4,  or  5  specify  read  cycles.  The  addresses 
are  supplied  from  the  register  pairs  IJ,  UV,  and  LT,  respectively. 
A  read  cycle  reads  one  byte  of  data  from  core  storage  into  the 
storage  data  register  R. 

A  write  cycle  regenerates  the  data  from  the  storage  data  regis- 
ter R  at  the  address  supplied  in  the  last  read  cycle. 

A  store  cycle  acts  exactly  as  a  write  cycle  except  that  it  inhibits 
in  the  read  cycle  immediately  preceding  it  the  insertion  of  the 
data  byte  from  storage  into  the  R-register. 
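
The interplay of read, write, and store cycles can be sketched as a toy model in Python. The sketch assumes the usual destructive readout of core storage, which is why a read must be followed by a regenerating write; the class and method names are hypothetical and the model ignores timing.

```python
class CoreStorage:
    """Toy model of core storage with a storage data register R."""

    def __init__(self, size: int) -> None:
        self.cells = [0] * size
        self.r = 0          # storage data register R
        self.address = 0    # address supplied in the last read cycle

    def read_cycle(self, address: int, inhibit_r: bool = False) -> None:
        """Read one byte; the readout clears the cell (assumed destructive)."""
        self.address = address
        value = self.cells[address]
        self.cells[address] = 0
        if not inhibit_r:   # a following store cycle inhibits loading R
            self.r = value

    def write_cycle(self) -> None:
        """Regenerate the byte in R at the address of the last read cycle."""
        self.cells[self.address] = self.r

    def fetch(self, address: int) -> int:
        """Read cycle followed by write cycle: the byte is kept in storage."""
        self.read_cycle(address)
        self.write_cycle()
        return self.r

    def store(self, address: int, value: int) -> None:
        """Read cycle with R inhibited, then write: R replaces the old byte."""
        self.r = value & 0xFF
        self.read_cycle(address, inhibit_r=True)
        self.write_cycle()


if __name__ == "__main__":
    ms = CoreStorage(16)
    ms.store(3, 0x5A)
    print(hex(ms.fetch(3)))  # 0x5a
```

In this model a fetch leaves the addressed byte unchanged, while a store replaces it with whatever the microprogram has placed in R.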

The  CU-field  specifies  whether  storage  access  should  be  to  main 
storage  (MS)  or  to  a  local  storage  of  256  bytes  not  explicitly  ad- 
dressable by  the  360  language  programmer. 

Microprogram sequencing and branching. Each microprogram
word is stored at a unique address in ROS. A 13-bit ROS address
register (W3 ... W7, X0 ... X7) holds the address of the word being
executed. For the symbolic representation of a microprogram (Fig.
6) the ROS address is given in hexadecimal in the upper right
corner, and the last two bits of this address are repeated in binary
on the upper margin.

After execution of a microprogram step, the next sequential
word will not be executed. Instead the address of the next word
to be executed is derived as follows. The high five bits (W) remain
the same, unless they are changed by a special command in the
microword, not explained here (so-called module switching). The
next six bits (X0 ... X5) are supplied from the CN-field (written
in hexadecimal in the symbolic representation of Fig. 6). The low
two bits are set according to conditions specified in the CH and
CL fields. X6 is set according to the condition specified by CH.
For instance, if CH = 8, then the bit R2 is transferred to X6; if
CH = 6, then X6 is set to one if in the last ALU operation a carry
had occurred. It is set to zero if no carry had occurred. X7 is
controlled by CL. If, for instance, CL = 0, then X7 is set to zero;
if CL = 5, then X7 is set to one if both digits in R are valid decimal
digits (i.e., R0 ... R3 ≤ 9 and R4 ... R7 ≤ 9); X7 is set to zero if
either digit in R is not a valid decimal digit (i.e., R0 ... R3 > 9
or R4 ... R7 > 9). This microprogram sequencing scheme allows
a four-way branch after the execution of each microprogram word.
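
The derivation of the next ROS address can be sketched as follows. The function names are hypothetical, and module switching is left out. The usage example takes the word at 11C4 from the walk-through later in this chapter; the CN value X'32' is inferred here from its four branch targets 11C8 through 11CB and is not quoted from the microprogram listing.

```python
def next_ros_address(current: int, cn: int, x6: int, x7: int) -> int:
    """Combine W (unchanged), CN (X0 ... X5), and the branch bits X6, X7."""
    w = (current >> 8) & 0x1F            # high five bits W3 ... W7
    return (w << 8) | ((cn & 0x3F) << 2) | ((x6 & 1) << 1) | (x7 & 1)


def valid_decimal(r: int) -> int:
    """Condition for CL = 5: 1 if both digits of R are at most 9."""
    return int((r >> 4) <= 9 and (r & 0x0F) <= 9)


if __name__ == "__main__":
    for x6 in (0, 1):
        for x7 in (0, 1):
            print(hex(next_ros_address(0x11C4, 0x32, x6, x7)))
    print(valid_decimal(0x39), valid_decimal(0x3A))  # 1 0
```

The four addresses produced differ only in their low two bits, which is the four-way branch described above.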

Status bit setting. The CS-field allows the unconditional or conditional
setting of certain status bits to be specified, combined in
Register S. If, for instance, CS = 3, then S4 is set to one if the
result of the ALU operation performed in this microprogram cycle
shows a zero in the high digit (i.e., Z0 = Z1 = Z2 = Z3 = 0); S4
is set to zero otherwise. At the same time, S5 is set to one if the
result of the ALU operation shows a zero in the low digit (i.e.,
Z4 = Z5 = Z6 = Z7 = 0); S5 is set to zero otherwise. If CS = 9,
then S2 is set to one if the result of the ALU operation is not
zero (i.e., at least one of the bits Z0 ... Z7 is equal to 1). If the
result of the ALU operation is zero, then S2 is not changed.
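
The two CS codes explained above can be expressed as a short Python sketch. Register S is modeled as a list of eight bits indexed S0 to S7; only CS = 3 and CS = 9 are covered, and the function name is hypothetical.

```python
def set_status(s: list, z: int, cs: int) -> None:
    """Update the status bits in s (S0 ... S7) from the ALU result z."""
    z &= 0xFF
    if cs == 3:
        s[4] = int((z >> 4) == 0)     # high digit zero -> S4
        s[5] = int((z & 0x0F) == 0)   # low digit zero  -> S5
    elif cs == 9:
        if z != 0:                    # S2 is left unchanged for a zero result
            s[2] = 1
    else:
        raise NotImplementedError("only CS = 3 and CS = 9 are sketched")


if __name__ == "__main__":
    s = [0] * 8
    set_status(s, 0x01, 3)
    print(s[4], s[5])  # 1 0
```

Because CS = 9 never resets S2, a sequence of such steps accumulates an "any result nonzero" indication in S2.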

Constant  field.  The  4-bit  CK-field  is  used  for  various  purposes.  One 
instance  explained  in  the  ALU  statement  is  to  supply  a  constant 
B-source  for  an  ALU  operation.  Other  examples  not  explained  here 
any  further  are  the  addressing  of  a  few  specific  scratchpad  local 
storage  locations,  module  switching  (replacement  of  the  high  part 
W of the ROS address), and the control of certain special functions.

Symbolic  representation  of  microprograms.  Microprograms  are 
symbolically  represented  as  a  network  of  boxes  (Fig.  6)  each 
representing  a  microword,  connected  by  nets  indicating  the  pos- 
sible branching  ways.  Figure  7  gives  an  example  of  a  microprogram 
(to  be  explained  in  the  next  section).  There  exist  programming 
systems  to  aid  in  the  development  of  microprograms.  They  contain 
symbolic  translators  to  translate  the  contents  of  a  box  according 
to  Fig.  6  into  the  contents  of  the  actual  fields  of  the  microprogram 
word  according  to  Fig.  5.  A  drawing  program  generates  documen- 
tation (Fig.  7  is  drawn  with  such  a  program).  These  systems  usually 
also  contain  programs  for  simulation  and  generation  of  the  actual 
ROS  cards. 

String  language  interpreter  for  EULER 

The  string  language  interpreter  for  EULER  is  entirely  written  in 
Model  30  microcode.  It  consists  of  a  few  microprogram  steps  to 
read  the  next  sequential  symbol  from  the  program  string  and  to 




Fig.  7.  Microprogram  for  the  operators  AND,  OR,  and  THEN. 


do a function branch on the symbol and of a group of microprogram
routines which perform the necessary operations for the
program byte read. These routines also take care of dynamic type
testing and stack pointer manipulations. The routines are equivalent
to the routines described in the definition of the string language
for EULER [Wirth and Weber, 1966a and 1966b].

Figure 7 shows, as an example, the microprogram to interpret
the program string symbols and (internal representation X'52'¹),
or (X'50'), and then (X'53'). These operators test if the highest entry
in the stack is a value of type logical. The logical operators in
EULER work in the FORTRAN sense, not in the ALGOL sense:
if after the evaluation of the first operand the result is determined
(false for and, true for or), then the second operand is not evaluated
but skipped over. If an and operator finds the value false,
then a branch occurs to the program address given in the two
trailer bytes. If an and finds the value true, then it deletes this
value from the stack and proceeds to the next symbol in the program
string (to evaluate the second operand of and). Similarly if
an or operator finds the value true, then a branch occurs to the
program address given in the two trailer bytes. If an or finds the
value false, then it deletes this value from the stack and proceeds
to the next symbol in the program string. The then operator is a
conditional branch code: it deletes the logical value from the
stack. If this value was false, then a branch is taken to the program
address given in the two trailer bytes. If this value was true, then
the next symbol in the program string is executed.

¹X'nn' represents the hexadecimal number composed of the digits n
(n = 0, ..., 9, A, ..., F).
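
The behavior of the three operators can be summarized in a Python sketch. It is a functional paraphrase of the rules above and not a model of the microprogram: stack entries are simplified to (type, value) pairs, whereas the actual entries occupy four bytes, and the function name `step` is hypothetical. The operator codes and the high-byte-first order of the trailer address are taken from the text.

```python
AND, OR, THEN = 0x52, 0x50, 0x53


def step(program: bytes, pc: int, stack: list) -> int:
    """Interpret one and/or/then operator; return the new instruction counter."""
    op = program[pc]
    target = (program[pc + 1] << 8) | program[pc + 2]   # two trailer bytes
    kind, value = stack[-1]
    if kind != "logical":
        raise TypeError("highest stack entry is not of type logical")
    if op == AND:
        if not value:
            return target        # result is false; second operand skipped
        stack.pop()
        return pc + 3            # go on to evaluate the second operand
    if op == OR:
        if value:
            return target        # result is true; second operand skipped
        stack.pop()
        return pc + 3
    if op == THEN:
        stack.pop()              # then always deletes the logical value
        return pc + 3 if value else target
    raise ValueError("operator not covered by this sketch")


if __name__ == "__main__":
    stack = [("logical", False)]
    print(hex(step(bytes([AND, 0x12, 0x34]), 0, stack)), stack)
```

Note that for and with value false (and or with value true) the entry stays on the stack, since it is already the result of the whole expression.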

The pointer to the symbol in the program string (the instruction
counter) is located in the functionally associated pair of registers
I and J in the Model 30. The pointer to the left-most byte of the
highest entry in the stack (the stack pointer) is located in the two
registers U and V in the Model 30.

In  the  following  the  individual  steps  in  this  microprogram  are 
explained  in  more  detail. 


Address:  Location in figure:  Description

1161:  C1:  The instruction counter IJ addresses main storage. The
            addressed byte in main storage is read out into the storage
            data register R. The instruction counter is updated by
            adding 1 to register J. A possible carry is saved to be
            added to I.

1117:  C2:  The operator has been read out from main storage into R.
            It is also transferred (through the ALU) to register G. A
            four-way branch occurs on the two highest bits R0 and R1
            of the operator. For the operators 52, 53, and 50 this
            branch goes to ROS word 1171, whereas other operators
            cause a branch to 1170, 1172, or 1173, indicated by the
            three lines not continued.

1171:  C3:  To complete the updating of the instruction counter, the
            carry from 1161 is added into I. The first byte of the
            highest entry of the stack is addressed by UV and read out
            into R. A further four-way branch on the operator is made
            (G2, G3). For our operators the branch goes to 115D.

115D:  C4:  The high order byte of the highest stack entry has been
            read out of storage into R. It contains the type of entry
            in the high digit and if this type was logical then it
            contains the value true (1) or false (0) in the second
            digit. This byte is tested by adding X'D0' to it and
            observing the result, ignoring the carry. S4 is set to 1
            when the type was 3 (logical), otherwise to 0. S5 is set
            to 1 when the low digit of this byte was 0 (value false);
            S5 is set to 0 when the low digit of this byte was 1
            (value true). Another four-way branch occurs on the bits
            G4 and G5 of the operator. If the operator is 50 (or),
            51 (cannot occur), 52 (and), or 53 (then), then a branch
            to 11C4 occurs.

11C4:  L4:  The next byte is read from the program string; it is the
            high byte of the two-byte program address trailing the
            operator. The instruction counter is updated again by
            adding a 1 to J, saving a possible carry. Another four-way
            branch occurs on the bit G6 of the operator and the value
            of the stack entry. If the operator was and or then
            (G6 = 1) and the value was false (S5 = 1), then branching
            to 11CB occurs; if the operator was or (G6 = 0) and the
            value was true (S5 = 0), then branching to 11C8 occurs.
            If the operator was or (G6 = 0) and the value was false
            (S5 = 1), then branching to 11C9 occurs. If the operator
            was and or then (G6 = 1) and the value was true (S5 = 0),
            then branching to 11CA occurs.

11CB:  G5:  This word is executed for the operators and and then when
            the value was false. Here the type test is made. If the
            type was not logical (S4 = 0), then a branch to 11C1
            occurs. If the type was correct, then the microprogram
            proceeds to fetching the trailing program address (two
            bytes) to store it as the new instruction counter in IJ.
            This is done for the and operator (G7 = 0) in this word
            and the following two words 11C3 and 111E; for the then
            operator (G7 = 1) it is done in this word and the words
            11C3 and 111F.

11C3, 111E:  J6, J7:  The two bytes trailing the operators and or or
            are stored as the new instruction counter in IJ. The
            operation is completed. The microprogram branches back to
            1161 to read out the next operator.

11C3, 111F:  J6, L7:  The two bytes trailing the operator then are
            stored as the new instruction counter in IJ. The
            carry-saving bit S3 is forced to zero.

11CE, 1144:  N8, N9:  The stackpointer is decremented by four (the
            operator '-' means complement add), which in effect
            deletes the highest entry from the stack. Observe that
            when these two words are entered from 111F (then operator
            with value false) the microprogram will not go through
            1145 because we have forced S3 to zero in 111F. The
            operation is completed, and the microprogram branches back
            to 1161 to read out the next operator.

11C8:  J5:  This word is executed for the operator or when the value
            was true. Similarly as in 11CB, the typetest is taken. For
            types not logical a branch to 11C1 occurs. If the type was
            correct, then the microprogram proceeds to fetching the
            trailing program address (two bytes) to store it as the
            new instruction counter in IJ (words 11C3, 111E).

11C9:  N5:  This word is executed for the operator or when the value
            was false. A typetest is made. If the type was correct,
            then the trailing program address is skipped and IJ is
            updated by 1 twice in 11C4, 11C9 (possible carries out of
            J handled in 11CF or 1145). The stackpointer is
            decremented by four in 11CE, 1144.

11CA:  Q5:  This word is executed for the operators and and then when
            the value was true. A typetest is made. If the type was
            correct then the trailing address is skipped. IJ is
            updated by 1 twice in 11C4, 11CA (possible carries out of
            J handled in 11CF or 1145). The stackpointer is
            decremented by four in 11CE, 1144.

11C1, 11CC, 11CD:  G6, L6, N6:  These words are executed when a
            typetest fails. An error code 01 is set up in L and a
            branch occurs to the error routine not drawn here.
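
The type test made in word 115D and the branch taken in word 11C4 can be checked with a short Python sketch. Both functions are hypothetical paraphrases of the descriptions above; bit G0 is taken to be the high-order bit of G, so that G6 is the bit of weight 2, in line with the statement that R0 and R1 are the two highest bits.

```python
def type_test(r: int):
    """Add X'D0' to the first byte of the stack entry, ignoring the carry.

    Returns (S4, S5): S4 = 1 if the high digit of the sum is zero,
    i.e. the type was 3 (logical); S5 = 1 if the low digit is zero,
    i.e. the value was false.
    """
    z = (r + 0xD0) & 0xFF
    return int((z >> 4) == 0), int((z & 0x0F) == 0)


def branch_from_11c4(operator: int, s5: int) -> int:
    """Four-way branch on G6 (to X6) and S5 (to X7) out of word 11C4."""
    g6 = (operator >> 1) & 1
    return 0x11C8 | (g6 << 1) | (s5 & 1)


if __name__ == "__main__":
    s4, s5 = type_test(0x30)                  # logical, value false
    print(s4, s5)                             # 1 1
    print(hex(branch_from_11c4(0x52, s5)))    # 0x11cb
```

Adding X'D0' turns the type digit 3 into zero, so one ALU operation with CS = 3 delivers both the type check and the logical value.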


The total ROS space requirement for the String Language Interpreter is:

Coded routines                       1000 microwords
Routines for real number handling     500 microwords  (estimated)
Divide, Exponentiation, etc.          400 microwords  (estimated)
Garbage collector                     600 microwords  (estimated)
Total                                2500 microwords

It can be seen from Fig. 7 that the execution times of the
microprograms including the readout of the operator (I-Cycle) are
the following:

and     6 μsec¹ (8 microprogram steps)

or      6 μsec (8 microprogram steps)

then    6 μsec for value true (8 microprogram steps)

        7.5 μsec for value false (10 microprogram steps)
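
These figures follow directly from the 750-nsec machine cycle, one cycle per microprogram step. A minimal Python check (the names are hypothetical):

```python
ROS_CYCLE_NSEC = 750  # basic machine cycle: fetch and execute one microword


def execution_time_usec(steps: int) -> float:
    """Execution time in microseconds for a given number of microprogram steps."""
    return steps * ROS_CYCLE_NSEC / 1000


if __name__ == "__main__":
    print(execution_time_usec(8))   # 6.0
    print(execution_time_usec(10))  # 7.5
```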

In order to compare this with a hypothetical EULER system
for System/360 language, let us assume that the compiler produces
in-line code (which probably will give the highest performance
although it will be very wasteful with respect to storage space).
Then a reasonable sequence for and might be:

CLI    0(STACK),LOGFALSE

BE     ANDFALSE

CLI    0(STACK),LOGTRUE

BNE    TYPEERR

SH     STACK,='4'

Timing: true: 90 μsec; false: 32 μsec.

This comparison seems to indicate that the microprogram interpreter
is about an order of magnitude faster than the equivalent
program in 360 language. However, this comparison will only yield
such a high factor for functions of EULER which do not have
simple System/360 language counterparts (as for instance the
list-operators, begin-, end-, and procedure-call-operator) or where
the overhead for dynamic testing and stackpointer manipulation
is heavy as in the above example of the logical operations. For
functions which do have System/360 language counterparts and
which are slower so that the overhead is relatively lighter as, for
instance, arithmetic operations (especially for real numbers), the
microprogrammed interpreter will still be faster than the System/360
language program, but not by a factor of 10.
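
The factor can be made explicit with the timings quoted for the and operator: 6 μsec in microcode against 90 μsec (value true) and 32 μsec (value false) for the in-line 360 sequence. The small calculation below is only a restatement of those numbers; the function name is hypothetical.

```python
def speedup(conventional_usec: float, microprogram_usec: float) -> float:
    """Ratio of conventional execution time to microprogrammed execution time."""
    return conventional_usec / microprogram_usec


if __name__ == "__main__":
    print(speedup(90, 6))            # 15.0
    print(round(speedup(32, 6), 1))  # 5.3
```

The ratio thus ranges from roughly 5 to 15 for this operator, which is the sense in which "about an order of magnitude" should be read.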

¹The cases where carries occur in the IJ and UV updating are disregarded
for timing purposes.


EULER  compiler 

The translator to translate EULER source language into the Reverse
Polish String Language is a one-pass, syntax-driven compiler.
The syntax of the language and the precedence functions F and
G over the terminal and nonterminal symbols are stored in table
form in Model 30 main storage. There is also main storage space
reserved for translation tables for character delimiters and word
delimiters and for a compile time stack, a name table, and, of
course, for the compiled code. All these areas are at fixed storage
locations because of the experimental nature of the system.
The microprogram consists of the following parts:

1  A routine reads the next input character from the input
buffer to translate it to a 1-byte internal format, if it is a
delimiter, or to collect it into a name buffer if it is part
of an identifier, or to convert it to hexadecimal if it is part
of a numeric constant and to collect the number into a
buffer. This "prescan" requires 100+ microwords.

2  As soon as an input unit is collected (delimiter, identifier,
number) the main parsing loop is entered which makes use
of the precedence tables and the syntax table in main storage.
This syntactic analyzer loop requires 100− microwords.

3  When the parsing loop identifies a syntactic unit to be
reduced, it calls the appropriate generation routine which
performs essentially the functions described as the semantic
interpretation rules in the EULER definition. The microprogram
space required for these programs amounts to
approximately 250 ROS words.

4  If a syntactic error is detected, the system signals an error
and does not try to continue with the compilation process.
Though this procedure is totally inadequate for a practically
useful system, it was deemed sufficient to prove the essential
point. For this minimum error analysis and for linkage to
the 360 microprograms (IOCP), approximately 60 microwords
are required.




The  total  compiler  microprogram  space  is  therefore  approxi- 
mately 500  ROS  words.  The  total  main  storage  space  required  is 
approximately  1200  bytes. 
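
The total can be tallied from the four parts listed above. In the sketch below the figures "100+" and "100−" are both taken as 100, which is an assumption made only for the sake of the sum; the dictionary keys are descriptive labels, not names from the compiler.

```python
compiler_microwords = {
    "prescan": 100,                     # "100+" in the text, taken as 100
    "syntactic analyzer loop": 100,     # "100-" in the text, taken as 100
    "generation routines": 250,
    "error analysis and linkage": 60,
}

total = sum(compiler_microwords.values())

if __name__ == "__main__":
    print(total)  # 510
```

The sum of 510 agrees with the stated total of approximately 500 ROS words.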

The  speed  of  this  compiler  is  limited  by  the  speed  of  the  card- 
reader  of  the  system  (1000  cards/minute).  This  excellent  per- 
formance has  three  main  reasons:  (1)  EULER  as  a  simple  prece- 
dence language  is  a  language  extremely  easy  to  compile.  (2)  The 
functions  of  a  compiler  are  mainly  of  a  table  lookup  and  bit  and 
byte-testing  type.  Microprogramming  is  extremely  well-suited  for 
these  kinds  of  operations.  (3)  Since  the  target  language  is  String 
Code  and  not,  for  example,  360  Machine  Language,  the  generative 
part  of  the  compiler  is  relatively  short. 

It  is  very  difficult  to  assess  the  individual  contributions  of  these 
three  main  reasons  to  the  high  compiler  performance.  Therefore, 
it  is  not  possible  at  this  stage  to  make  a  statement  as  to  whether 
the  nature  of  the  language  EULER  or  the  fact  that  the  compiler 
is  microprogrammed  is  the  dominant  factor. 

Development  of  the  microprogram 

Since  there  is  no  higher  level  language  to  express  microprogram 
procedures  and  no  compiler  to  compile  microcode,  the  micropro- 
grams were  written  in  the  symbolic  language  explained  in  Fig. 
6.  Actually  the  process  was  a  hand  translation  of  the  algorithms 
in  the  EULER  definition  to  the  symbolic  microprogram  language. 
The  microprograms  were  translated  into  actual  microcode  and 
simulated  before  they  were  put  on  the  System/360  Model  30  by 
means  of  a  general  microprogram  development  system. 

Outlook  and  general  discussion 

It is hoped that the development of this experimental system for
EULER shows that with the help of microprogramming we can
create systems for higher level languages or special applications
which utilize existing computer hardware to a much higher degree
than conventional programming systems.

Among  the  thoughts  which  are  raised  by  this  scheme  are  the 
following: 

1  There should be an investigation to determine the ideal
directly interpretable languages which correspond to higher
level languages. Although several attempts have been made
to define string languages for interpretive systems (for instance
in Wirth and Weber [1966a and 1966b] and Melbourne
and Pugmire [1965]), to the author's knowledge no
work has been published which attacks this question in a
general and theoretically founded manner.

2  A  proliferation  of  interpretive  languages  and  the  develop- 
ment of  microprogrammed  interpreters  can  be  justified 
when  better  tools  are  developed  to  reduce  the  cost  of 
microprogramming.  It  is  necessary  that  we  be  able  to  ex- 
press microprogramming  concepts  (and  also  machine  design 
concepts)  in  a  higher  level  language  form  and  that  we 
develop  compilers  which  translate  the  microprograms  from 
higher  level  language  form  to  actual  microcode.  Also,  good 
microprogram  simulation  and  debugging  tools  are  called  for. 

3  The whole relationship between programming, microprogramming,
and machine design should be viewed with a
common denominator: how should the tradeoffs be made
such that the ultimate goal can be reached more effectively,
... how to solve a user's problem? Green [1966]
offers some thinking in this direction but the state of the
art has to progress further before we will have a complete
understanding of what these relationships and tradeoffs are.

References 

WebeH67; FaggP64; GreeJ66; HainL65; MelbA65; WirtN66a, 66b; FORTRAN
Specifications and Operating Procedures, IBM 1401, IBM Systems
Ref. Lib. C24-1455-2.


Part  5 
The  PMS  level 

This part presents the PMS structure dimension of the computer space. The sections
are arranged in order of increasing organizational structure complexity. The sections
are as follows: 1 Pc; 1 Pc with multiple Pio; multiprocessing with n Pc; parallel
processing with n Pc; computers which are networks; and networks of computers.

In  Chap.  37  Lehman  defines  the  terms  multiprogramming,  multiprocessing,  and 
parallel  processing. 


Section  1 

Computers  with  one  central  processor 


The computers with one Pc and no Pio's control T and Ms in
either of two ways. First, the Pc contains the K for T and Ms;
second, a separate K controls a data transmission while Pc
initializes the K. In the latter case, a K is like a P where each
instruction is received from Pc instead of being fetched automatically
by K itself.

The  Whirlwind  I  computer 

Whirlwind (Chap. 6) controls data transmissions between Ms
or T and Mp by using Pc. Thus, arithmetic and input/output
processing concurrency is difficult to achieve. The structure is
first discussed in Part 2, Sec. 1, page 90.

The  SDS  910-9300  series 

The  SDS  910-9300  series  is  presented  in  Chap.  42  and  is  dis- 
cussed in  Part  6,  Sec.  2,  page  542.  The  input/output  and  the 
interrupt  system  are  especially  interesting. 




Section  2 


Computers  with  one  central  processor 
and  multiple  input/output  processors 

The computer structures discussed in this section are manufactured
mainly by IBM. The reason for this bias toward IBM
is that only fairly elaborate or very specialized structures have
Pio's; computers of other manufacturers which have Pio's tend
to have also the more general multiprocessing capability¹ that
would place them in Sec. 3.

The  DEC  PDP-8 

The PDP-8 is presented in Chap. 5, and its 338 P.display appears
in Chap. 25. Discussions are given in Part 2, Sec. 1 and
Part 4, Sec. 1, respectively. For this section, the reader should
look at the methods for transmitting data between Ms or T and
Mp. Three methods are used: Pio or P.display is used to control
T.displays (Chap. 25); Pc directly transmits a word to the buffer
of a K for low data rate devices (here a K may request data,
using the program interrupt); and a K transmits data directly
to Mp.

The  IBM  1800 

Chapter 33 describes the 1Pc-9Pio IBM 1800 computer. There
are  five  Pio  types,  depending  on  the  components  they  control. 
Although  we  classify  them  as  Pio's,  they  are  barely  processors 
since  the  instruction  counter  has  a  very  restricted  behavior. 
Unless  the  data  channel  has  "data  chaining"  capability  (in 
effect  a  jump  instruction),  it  is  not  a  processor. 

The  IBM  7094  II 

The  IBM  7094  II  computer  is  discussed  in  Part  6,  Sec.  1,  page 
515;  its  description  appears  in  Chap.  41.  The  earlier  709  was 
about  the  first  computer  to  use  independent  Pio's.  UNIVAC 
(Chap.  8)  has  a  very  extensive  K  for  data  transmission  con- 
current with  processing,  whereas  the  701  and  704  both  required 
Pc  to  control  each  data  word  transmitted.  The  Pio's  of  the  7094 
II  might  be  looked  at  as  an  overreaction  or  overdesign  inspired 
by  the  701-704. 

¹For example, the CDC 3600 [Casale, 1962], and the SDS Sigma 7 [Mendelson
and England, 1966].


The  structure  of  System/360, 

Part  I— outline  of  the  logical  structure 

The structure of the 360 is presented in Part 6, Sec. 3. A discussion
of an alternative implementation of the 360 by the
authors of this book, using multiprocessors, is given (page 585).
Chapter  43  gives  an  overview  of  the  ISP,  and  Chap.  44  presents 
the  implementations  of  various  360  models.  The  implementa- 
tions of  physical  processors  to  give  multiple  logical  processors 
using  microprogramming  are  interesting.  IBM  is  rather  conserv- 
ative in  regard  to  providing  structures  convenient  for  multi- 
programming; and  a  multiprocessing  design  appears  too  com- 
plex for  them  to  attempt  outside  a  research  environment. 

The  engineering  design  of  the  Stretch  computer 

Stretch  (also  known  as  Model  7030)  and  the  UNIVAC  LARC 
[Eckert,  et  al.,  1959]  are  perhaps  the  first  computers  with  the 
principal  design  goal  of  maximizing  numerical  computing 
power.  Stretch,  aptly  named  because  of  its  influence  on  the 
technology  (and  on  the  IBM  organization),  was  initiated  by  the 
Atomic  Energy  Commission  at  Los  Alamos.  It  was  designed  to 
interpret large-scale scientific programs for nuclear engineering.
Like a number of other high-risk major developmental
efforts in the computer field, Stretch was not outstandingly
successful as a computer system. Only a few (5 ~ 10) were built
at  a  cost  substantially  exceeding  their  contract  price  and  with 
performance  only  modestly  better  than  the  art  at  the  time  of 
their  production.  However,  again  in  common  with  other  similar 
efforts,  they  had  a  substantial  positive  effect  on  the  state  of 
the  art.  In  the  Stretch  case,  in  particular,  the  2.18-microsecond 
Mp  core  technology  developed  for  Stretch  was  transferred  to 
the  7090.  In  fact,  this  was  a  major  contribution  to  why  Stretch 
was  only  modestly  better  than  7090.  The  design  goal  was  per- 
formance 100  times  an  IBM  704.  The  computer  is  described 
at  a  high  level  in  Chap.  34.  Buchholz's  book  on  Project  Stretch 
[Buchholz,  1962]  is  outstanding  as  a  text  on  computer  struc- 
tures and  as  a  description  of  Stretch.  It  should  be  read  by  all 
computer  designers. 

Computers  built  to  maximize  numerical  computing  power 
also include, besides the UNIVAC LARC for the Lawrence Radiation
Laboratory at Livermore, the Control Data 6600 (Chap. 39),
and the IBM System/360, Models 91 and 85.
Stretch  derives  its  power  through: 

1  Compound  and  complex  ISP  instructions 

2  A PMS structure with Mp(2.18 μs/w), Pc(0.25 ~ 1 μs/w),
Pio's, and a satisfactory switch between P's and Mp

3  Many  data  types 

4  Parallelism  within  the  Pc,  involving  concurrent  interpre- 
tation of  the  instruction  stream  using  the  "Instruction 
look-ahead"  mechanism 

The  last  of  these,  internal  Pc  parallelism,  is  the  most  novel. 
Stretch  was  possibly  the  earliest  computer  to  make  use  of  it; 
each  of  the  other  "maximum"  power  C's  listed  above  also  uses 
some  version  of  instruction  look-ahead,  for  each  of  these 
"maximum"  systems  is  faced  with  how  to  obtain  computing 
power  that  goes  beyond  the  basic  logic  and  memory  technology 
available  at  the  time  the  system  is  designed.  The  conclusion, 
reached  in  all  these  cases,  is  to  move  toward  internal  paral- 
lelism. 

In  Stretch  the  instruction  look-ahead  mechanism  fetches  the 
next  several  instructions  and  partially  interprets  each  future 
instruction.  The  mechanism  is  elaborate  compared  with  the 
straightforward  instruction  stack  in  the  CDC  6600  (Chap.  39, 
page  489).  The  Stretch  look-ahead  complexity  stems  from  par- 
tially interpreting  instructions  which  may  later  have  to  be  un- 
done. 

Stretch uses a basic Mp(core; 16384 w; (64 + 8 parity) b/w;
tc:2.18 μs). Sixteen Mp's can be connected to the P's via the
S('Memory Bus; time multiplexed). The 8 parity bits are used
to  give  single-error  correction  and  double-error  detection,  which 
is  a  very  substantial  amount  of  error  protection  compared  with 
standard  design  practice.  This  is  the  memory  that  was  incor- 
porated in  the  IBM  7090  and  became  operational  even  before 
Stretch  was  delivered.  Thus,  as  is  often  the  case  with  large 
development  efforts,  the  by-products  are  as  important  as  the 
main  product. 
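
The figure of 8 check bits for a 64-bit word is what one would expect from a Hamming single-error-correcting code extended by one overall parity bit. The following Python sketch only illustrates that counting argument; the function names are ours, and nothing here is claimed to reproduce the actual Stretch encoding.

```python
def sec_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


def secded_check_bits(data_bits):
    """SEC check bits plus one overall parity bit for double-error detection."""
    return sec_check_bits(data_bits) + 1


if __name__ == "__main__":
    # For a 64-bit data word this gives the 8 check bits quoted in the text.
    print(secded_check_bits(64))
```

The same function shows why the scheme is comparatively cheap for long words: the number of check bits grows only logarithmically with the word length.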

There  is  a  single  well-designed  physical  Pio,  called  the  Ex- 
change, consisting  of  several  logical  Pio's.  Its  ability  to  have 
the  state  of  all  the  logical  Pio's  accessible  in  Mp  is  useful  and 
important.  This  design  seems  better  than  the  data  channels 
in  the  IBM  709-7094  series.  It  is  almost  a  prototype  for  the  IBM 
System/360  Pio's. 

The Stretch word length is 64 bits. It has operations on the
following data types: binary integers, decimal integers, address
integers, variable-length integers, boolean vectors, single and
double  floating  point.  The  length  of  the  variable  integer  is  speci- 
fied by  parameters  in  the  instruction.  Noisy-mode  floating-point 
data  provide  a  method  of  introducing  a  roundoff  error  in  the 
least  significant  bit  under  program  control.  Thus  a  problem  can 
be  run  in  conventional  and  noisy  modes  and  the  results  com- 
pared. An  instruction  is  either  32  or  64  bits. 

The  ISP  processor  state  has  an  instruction  counter,  a  dou- 
ble-length accumulator,  15  index  registers,  about  6  registers, 
and  about  100  miscellaneous  bits.  Computing  power  is  obtained 
by  having  an  instruction  set  with  complex  instructions.  Hence, 
there  is  an  instruction  for  almost  every  possible  operation, 
though  inverse  subtract  and  inverse  divide  instructions  are 
lacking.  However,  there  is  a  "multiply  and  add"  instruction. 
Stretch  has  the  complete  set  of  16  operators  for  boolean  vec- 
tors. Compound  instructions,  formed  from  a  sequence  of  sim- 
pler instructions,  also  increase  power.  These  instructions 
specify  the  array  element  to  be  accessed,  an  operation  on  the 
element,  and  a  calculation  to  get  the  next  element,  in  a  single 
instruction.  Notice  that  several  of  these  instructions  are 
oriented toward operations on arrays (i.e., matrices), which are
the  type  of  numerical-analysis  tasks  for  which  the  system  was 
built. 

Multiprogramming  was  done  with  Stretch  [Codd  et  al.,  1959] 
and  undoubtedly  had  some  influence  within  IBM.  Stretch  has 
a  pair  of  bounds  registers  to  relocate  and  protect  a  single 
program.  The  interrupt  scheme  for  Stretch  [Brooks,  1957a]  was 
better  than  that  of  existing  IBM  computers,  though  it  is  not 
described  in  Chap.  34. 

The  importance  of  Stretch  lies  in  the  by-products  it  inspired 
and  its  influence  on  IBM,  encouraging  a  concern  with  hardware 
project  management.  The  elaborate  ISP  and  the  complex  im- 
plementation of  Stretch  may  not  have  been  worth  the  effort, 
especially  when  one  compares  this  computer  with  the  later, 
larger  but  elegant  CDC  6600.  It  is,  however,  interesting  to  note 
that  Stretch  was  used  as  a  central  component  in  an  early  spe- 
cialized multiprocessor  system  called  the  IBM  Harvest  [Herwitz 
and  Pomerene,  1960],  which  provides  extremely  powerful  data- 
processing  capabilities. 

PILOT,  the  NBS  multicomputer  system 

The  National  Bureau  of  Standards'  PILOT  computer  (Chap.  35) 
was  first  described  in  1959.  At  that  time  it  was  a  multiple 
computer;  by  our  criteria,  we  classify  it  as  a  multiple-processor 
computer, as shown by its PMS structure (Fig. 1). However,
unlike present multiprocessors with several identical processors,
each PILOT processor is different.

Fig. 1. National Bureau of Standards' PILOT computer PMS diagram.

PILOT  is  a  good  example  of  an  early  attempt  to  use  multi- 
processors; successors  look  little  like  it.  It  has  one  of  the  best 
analytical  discussions  of  any  computer  [Leiner  et  al.,  1957]. 
With  this  machine  there  was  an  attempt  to  resolve  the  contro- 
versy between  the  short-word  EDSAC  (17  bits)  and  the  long- 
word  Institute  for  Advanced  Studies  computers  (40  bits)  by 
providing  a  processor  and  memory  (i.e.,  computers)  for  each 
problem.  Only  the  first  computer  had  substantial  Mp,  and  the 
other  computers,  or  processors,  could  be  concerned  only  with 
the first computer. The third computer was introduced to process
devices such as Ms(magnetic tape) and used a plugboard
program  memory.  The  idea  of  an  independent  processor  (IBM 
7094)  or  computer  (CDC  6600)  for  input/output  processing  is 
used  now,  though  it  is  doubtful  that  PILOT  inspired  these  de- 
signs. 

The  capacitor-diode  store  is  novel  and  daring  for  the  tech- 
nology. Two-  and  three-address  computers  are  used  in  the  pri- 
mary and  secondary  computers.  The  secondary  computer,  with 
16-bit  words,  is  not  very  useful;  its  memory  is  very  limited,  and 
it  is  essentially  used  only  for  address  calculations.  The  book- 
keeping operation  for  a  three-address  computer  could  easily 
keep  a  small  processor  busy. 


Chapter  33 
The  IBM  1800 


Introduction 

This third-generation computer is constructed with hybrid-circuit
technology (semiconductors bonded to ceramic substrates) known
as SLT (Solid Logic Technology). It has a core primary memory.

The 1800 is designed for process control and real-time applications.
It is nearly identical to the IBM 1130, which is designed
for small-scale, general-purpose, and scientific calculation applications.
The two C's perform about the same for computation-bound
problems. The 1130 and 1800 are not program compatible
with the "universal" IBM System/360 series, though introduced
at about the same time. However, the 1800 uses terminals and
secondary memories similar or identical to the System/360. These
are organized about the standard IBM System/360 8-bit byte. Thus
their common information media provide a link between the two.
Hence an 1800 is sometimes connected to the System/360 as a
preprocessor. The relative performance of the IBM 1130, 1800,
and the IBM System/360 can be seen on page 586. The 1800 has
a better cost/performance ratio than a System/360, Model 40 and
has the performance of a Model 30. From now on we will refer
only to the IBM 1800, although much applies to the IBM 1130.

The 1800's interface facilities include a large number of T's
which can connect to different physical processes; a multiple
priority interrupt facility with fast response; multiple Pio's which
can transfer information at high data rates;¹ and a complete
instruction set for real-time, nonarithmetic processing.

We include the 1800 because it is a typical, 16-bit, real-time,
process control computer. The ISP is the most straightforward of
the IBM computers in the book (and perhaps the nicest). The
several different Pio's and their implementations are unusual and
should be carefully studied. Important aspects of the 1800 include
the PMS structure as it links to real-time processes, e.g., analog
processes; the straightforward Pc ISP (Appendix 1 of this chapter);
the specialized Pio's for real-time T's; the Pc implementation; and
the Pio implementation. The chapter is written to expose and
explain these aspects.²

By comparing the 1800 with Whirlwind, an evolutionary progression
can be seen. Their ISP's are similar but, because of better
technology, the 1800 shows an increase in capability. The 1800
Pc has a medium-sized state (ISP has six registers) including three
index registers. The implementation is not elegant; a single register
array and adder would provide the basis for a straightforward Pc
implementation. The 1800 has features which facilitate higher
information processing rates compared with Whirlwind. The major
change between Whirlwind and the 1800 machines was brought
about by the decreasing cost of registers and primary memory.
In the 1800, all K's have independent memory (usually 1 ~ 2
words or characters) so that concurrent operation of almost all
the T and Ms via their K's is possible. In contrast, Whirlwind has
only a single, shared register in Pc, and only one device can
operate at a time.

¹Although we refer to the data channels as Pio's, they have a very limited
ISP for a Pio; in fact, they might better be called K's.
²Some of the material in the chapter has been abstracted from the IBM
1800 Functional Characteristics Manual.

Lower  hardware  costs  allow  multiple  Pio's  in  the  1800.  The 
Pio's  represent  an  unusual  approach  to  information  processing  in 
this  period.  The  Pio's  which  process  standard  disk,  magnetic  tape, 
and  card  reader  are  conventional,  but  the  Pio's  for  analog  and 
process  signals  are  novel  and  interesting.  The  latter  Pio's  are  the 
most  unusual  part  of  the  1800,  and  they  allow  independent  pro- 
grams in  each  Pio  to  do  some  very  trivial  processing  tasks  such 
as  alarm-condition  monitoring  independent  of  Pc.  However,  the 
Pio's  are  limited;  for  example,  it  is  difficult  to  transmit  or  receive 
a  data  block  between  Ms  and  Mp  (using  a  Pio)  without  surrounding 
the  data  block  with  Pio  control  words  (thereby  transmitting  the 
control  words). 

The interrupt system is typical of second- and third-generation
computers and is comparable to the SDS 900 series (Chap. 42).
In later computers interrupt conditions are used to determine a
fixed address to which the processor interrupts. There are generally
many conditions (100 to 1,000), but only a few discrete levels (8
to 20). The 1800 depends on program polling within a discrete
interrupt level; each level has a unique, fixed address.

A principal ISP design problem is the addressing of the 65,536-word
Mp. Thus, a 16-bit number has to be generated within Pc
for an address. In this regard the 1800 behaves like the 12-bit
machines which have to address a 2^12 (4,096) word memory, and
the modes or methods the 1800 uses for addressing are reasonable.
It should be noted that it is relatively difficult to write programs
which do not modify themselves. For example, the instruction,
Store Status, is changed by its execution.




The central processor¹ - primary memory

The IBM 1800 is a fixed-word-length, binary computer with 4, 8,
16, or 32-kword memories of 16 + 1 + 1 bits, and a memory cycle
time of 2 or 4 microseconds. Of the 18 bits, 1 bit is used as a parity
check (P bit) and 1 bit is used for storage protection (S bit). The
Pc instruction set operates on 16-bit and 32-bit words. Indirect
addressing and three index registers are used in address modification.
The Pc has a 24-level interrupt system, three interval timers,
and a console.

The  Pc  interrupt  is  a  forced  branch  (jump)  in  the  normal 
program  sequence  based  upon  external  or  internal  Pc  conditions. 
The  devices  and  conditions  that  cause  interrupts  are  hardwired 
in  fixed  priority  levels.  An  interrupt  request  is  not  honored  while 
the  level  of  the  request  itself  or  any  higher  level  is  being  serviced, 
or  if  the  level  requested  is  masked.  Examples  of  interrupt  condi- 
tions are: 

1    An  external  process  condition  that  requires  attention  is 
detected. 

¹IBM name: the Processor-Controller or PC.



A  peculiar  feature  of  the  1800  is  its  storage  protection  (see  page 
408).  This  feature  should  provide  program  relocation  capability 
in  addition  to  protection,  but  it  does  not. 

PMS  structure 

A  simplified  picture  of  the  IBM  1800  structure  is  given  in  Fig. 
1,  without  Pio('Data  Channel)'s  and  K('Device  Adapter)'s.  Each 
T  and  Ms  have  a  K  which  connects  Pc's  In  and  Out  Bus,  the  S('Pc 
to  K).  Some  K's  attach  to  Pio's  and  some  directly  to  Pc.  Information 
can  be  transferred  between  Mp  and  K  via  Pio  at  rates  up  to  0.5 
megaword/s  or  8  megabits/s.  The  IBM  Configurator  (Fig.  2)  gives 
the  restrictions  on  the  possible  structures,  together  with  minute 
L  details.  It  is  presented  as  an  alternative  to  the  PMS  structure 
(Fig.  1).  The  Configurator  is  intended  to  show  the  "permissible 
structures"  but  does  not  show  the  logical  or  physical  structure. 
The  PMS  diagram  (Fig.  3)  alternatively  shows  the  physical-logical 
hardware  structure  and  performance  parameters.  It  should  be 
noted  that  a  PMS  diagram  with  the  information  of  the  computer 
component  Configurator  (Fig.  2)  would  require  slightly  more  de- 
tails (and  space). 
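
The two transfer rates quoted above, 0.5 megaword/s and 8 megabits/s, are consistent with each other if one assumes the faster 2-microsecond memory cycle, one word moved per Mp cycle, and 16 data bits per word (parity and protect bits not counted). The short Python sketch below works through that arithmetic; the function names are illustrative only.

```python
def words_per_second(cycle_time_us):
    """Peak word rate, assuming one word is moved on every Mp cycle."""
    return 1_000_000 / cycle_time_us


def data_bits_per_second(cycle_time_us, data_bits_per_word=16):
    """Peak data rate, counting only the 16 data bits of each word."""
    return words_per_second(cycle_time_us) * data_bits_per_word


if __name__ == "__main__":
    for cycle in (2, 4):  # the two Mp cycle times given for the 1800
        print(cycle, words_per_second(cycle), data_bits_per_second(cycle))
```

With the 4-microsecond memory the same assumptions give half of each rate.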


Fig. 1. IBM 1800 data acquisition and control system. (Courtesy of International Business Machines Corporation.)




Fig. 2. IBM 1800 data-acquisition and control-system configurator.
(Courtesy of International Business Machines Corporation.)




Notes to Fig. 3:
Mp(core; 2|4 μs/w; 4096 ~ 32768 w; (16, parity, protect) b/w)
Pc('1801|1802; 1 ~ 2 w/instruction; technology: hybrid; Mps(~6 w); 1 address/instruction; ~1965)
S('In Bus, Out Bus)
Maximum of 9 Pio per C
Pio('Digital Input Data Channel)
Pio('Digital, Analog Output Data Channel)
Pio('Analog Input Data Channel)
Optional Pio to control analog channel (structure is greatly simplified)
K('ADC; analog; input; 9, 12, 15 b/w; i.rate: 9 ~ 24 kw/s)

Fig.  3.  IBM  1800  PMS  diagram  (simplified). 




2  An  interval  timer  has  counted  a  previously  set  time  interval. 

3  A  magnetic-tape  drive  has  completed  a  data  transfer  previ- 
ously requested  and  is  ready  for  another  request. 

4  An  operator  has  initiated  an  interrupt  from  the  Pc  console. 

5  A  device  such  as  a  typewriter  has  just  printed  a  character 
and  is  ready  to  receive  the  next  one. 
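
The rule stated earlier for honoring requests (a request is not honored while its own level or any higher level is being serviced, or while its level is masked) can be sketched in a few lines of Python. This is an illustration under our own assumptions, not IBM's logic: levels are numbered so that a smaller number means higher priority, and the function names are hypothetical.

```python
def is_honored(request_level, in_service, masked):
    """True if a request on request_level would be honored now.

    in_service: set of levels currently being serviced.
    masked: set of levels that are masked off.
    Assumption: a smaller level number means a higher priority.
    """
    if request_level in masked:
        return False
    # Blocked if the same level or any higher-priority level is in service.
    return all(level > request_level for level in in_service)


def next_interrupt(pending, in_service, masked):
    """Highest-priority pending level that would be honored, or None."""
    candidates = [lvl for lvl in pending if is_honored(lvl, in_service, masked)]
    return min(candidates) if candidates else None


if __name__ == "__main__":
    # Level 5 is in service: level 3 may interrupt it, levels 5 and 7 must wait.
    print(is_honored(3, {5}, set()), is_honored(5, {5}, set()), is_honored(7, {5}, set()))
```

Within a level the 1800 relies on program polling, so the sketch stops at choosing the level and does not try to identify the individual condition.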

Primary-memory communication and data transmission with
terminals and secondary memory

Two methods are used to transmit data between Mp and Ms, or
Mp and T. First, low-speed devices are controlled directly by
the program. Each character or word of data is transmitted to or
from the Pc and onto T by means of an Execute I/O (XIO) instruction.
The Pc program and device synchronization are accomplished
by using the interrupt mechanism. Devices operating under direct
program control include typewriter, printer, plotter, paper tape
reader and punch, analog-to-digital converters, contact sense,
voltage-level sense, pulse counters, etc.

The second method of transferring data is via the Pio('Data
Channel)'s. The Pio program is started by the XIO instruction of
the Pc. The transfer of data words then proceeds under control
of the specified Pio, completely asynchronous to and in parallel
with Pc program operation. The Pio gains Mp access independent
of Pc (Pc operation is suspended for one Mp cycle). During the
Mp cycle, the data are taken from or placed into core storage by
Pio (via internal Pc control and registers). As soon as the Pio has
been satisfied, which normally takes one cycle, the Pc proceeds.
The logical state of the Pc, or the Instruction-set Processor, is not
changed by Pio's access to Mp. This method of access is referred
to as "cycle stealing." Devices (Ms and T) operating under Pio
control include magnetic tapes, disks, line printer, card reader-punch,
and the link to the IBM System/360.
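
Cycle stealing as just described can be pictured with a very small Python model. It is a sketch under simplifying assumptions of our own: a single Pio, input transfers only, and device timing represented by a hypothetical table (`arrivals`) that says on which Mp cycle a word is ready. None of the names come from IBM documentation.

```python
def simulate_cycle_stealing(total_cycles, arrivals, mp, channel_address):
    """Share Mp cycles between Pc and one Pio.

    arrivals maps a cycle index to the device word that the Pio must
    store during that cycle (a stand-in for real device timing).
    Returns the number of Mp cycles left over for the Pc.
    """
    pc_cycles = 0
    for cycle in range(total_cycles):
        if cycle in arrivals:
            # Stolen cycle: the Pio stores one word and Pc waits one cycle.
            mp[channel_address] = arrivals[cycle]
            channel_address += 1
        else:
            # Normal cycle: Pc uses Mp; its logical state is not touched by steals.
            pc_cycles += 1
    return pc_cycles


if __name__ == "__main__":
    memory = [0] * 8
    left_for_pc = simulate_cycle_stealing(10, {2: 7, 5: 9}, memory, 4)
    print(left_for_pc, memory)
```

The point of the model is that the Pc loses only the cycles actually stolen, so a slow device costs the program very little.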

Some  devices  can  operate  under  both  Pc  and  Pio  control, 
depending  on  their  characteristics  and  the  configuration,  e.g., 
analog  input,  analog  output,  digital  input,  and  digital  output. 

Process  I/O,  controls  and  transducers 

Analog inputs. Analog-input equipment includes analog-to-digital
converters,  multiplexors,  amplifiers,  and  signal  conditioning  equip- 
ment to  handle  various  analog-input  signals.  The  data  input  rates 
are  up  to  20,000  16-bit  samples  per  second,  with  program  selecta- 
ble resolution  and  external  synchronization.  There  can  be  1,024 
(via  relay)  and  256  (via  high-speed  solid  state)  multiplexed  analog- 
input  channels  connected  to  a  single  K  (analog-to-digital  con- 
verter). The  Configurator  (Fig.  2)  shows  the  allowable  inputs. 


Digital  inputs.  The  Digital  Input  provides  up  to  384  process  in- 
terrupts; up  to  1,024  bits  of  contact  sense,  digital  input,  or  parallel 
register  input;  and  128  bits  of  event  input  counters  as  1-,  8-,  and 
16-bit  counting  registers. 

Analog outputs. Up to 128 analog outputs can be provided.

Digital  outputs.  Digital  Outputs  provide  up  to  2,048  bits  of  pulse 
output,  contacts,  and  registers. 

I/O processors (data channels)

Pio('Data  Channels)  give  a  T  or  Ms  the  ability  to  communicate 
directly  with  Mp.  For  example,  if  an  input  unit  requires  a  primary 
memory  cycle  to  store  data  that  it  has  collected,  the  Pio  communi- 
cates directly  with  Mp  and  stores  the  data. 

The  Pio's  run  even  if  Pc  is  waiting.  The  Pio's  have  two  registers: 
a  Word  Count  which  is  used  to  count  the  number  of  words  being 
transferred  in  a  block  between  a  device  and  Mp  memory;  and  a 
Channel  Address  which  points  to  the  next  word  transferred  in  a 
block.  The  Channel  Address  is  also  used  to  select  the  next  instruc- 
tion in  the  program  for  the  next  block  transfer  task. 

Two basic types of Pio's are used, nonchaining and chaining.¹
The  Pio's  provide  the  ability  to  transfer  either  a  single  block 
(nonchaining)  or  multiple  blocks  (chaining)  directly  to  Mp  inde- 
pendent of  Pc. 
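
The roles of the Word Count and Channel Address registers, and the difference between nonchaining and chaining operation, can be sketched as follows. This Python model is only an illustration: the class and function names are ours, and the list of blocks is passed in directly, whereas the real channel finds the control information for its next block in Mp by way of the Channel Address.

```python
class DataChannel:
    """Toy model of a Pio with a Word Count and a Channel Address."""

    def __init__(self, mp):
        self.mp = mp
        self.word_count = 0
        self.channel_address = 0

    def start_block(self, address, count):
        """Set up one block transfer of `count` words starting at `address`."""
        self.channel_address = address
        self.word_count = count

    def transfer_in(self, word):
        """Store one device word in Mp; return True when the block is done."""
        if self.word_count == 0:
            raise RuntimeError("no block transfer in progress")
        self.mp[self.channel_address] = word
        self.channel_address += 1   # points to the next word of the block
        self.word_count -= 1        # one fewer word left to transfer
        return self.word_count == 0


def run_chain(channel, blocks, device_words):
    """Transfer several (address, count) blocks in turn, without the Pc.

    Each count is assumed to be at least 1.
    """
    words = iter(device_words)
    for address, count in blocks:
        channel.start_block(address, count)
        done = False
        while not done:
            done = channel.transfer_in(next(words))
```

A nonchaining transfer corresponds to calling `run_chain` with a single block; a chaining transfer passes several blocks, for example `run_chain(channel, [(2, 3), (10, 2)], [1, 2, 3, 4, 5])`.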

The  central  processor 

Registers  in  the  physical  processor 

Figure 4 shows the relationship of the registers in Pc, together
with those in the Instruction-set Processor. Those registers accessible
by the program are shown with an *. All the registers are
accessible from the console. A description of the functions of each
register is given below.

Storage address register (SAR). All Pc references to Mp are selected
or accessed by this 16-bit register. Pio references to Mp use the
Channel Address Register (CAR) of the active Pio.

Instruction register (I)*. This 16-bit counter register holds the
address of the next instruction.

Storage  buffer  register  (B).  This  16-bit  register  is  used  for  buffering 
all  word  transfers  with  Mp. 

¹A descriptive name undoubtedly concocted by one of IBM's marketing
departments.




*Registers accessible to Instruction Set Processor
**Allows processor registers to be read or written

Fig. 4. IBM 1800 Pc data flow. (Courtesy of International Business Machines Corporation.)


Arithmetic  factor  register  (D).  This  16-bit  register  is  used  to  hold 
one  operand  for  arithmetic  and  logical  operations.  The  Accumu- 
lator provides  the  other  factor. 

Accumulator (A)*. This 16-bit register contains the results of any
arithmetic  operation.  It  can  be  loaded  from  or  stored  into  core 
storage,  shifted  right  or  left,  and  otherwise  manipulated  by  specific 
arithmetic  and  logical  instructions. 

Accumulator extension (Q)*. This register is a 16-bit low-order
extension  of  the  Accumulator.  It  is  used  during  multiply,  divide, 
shifting,  and  double-precision  arithmetic. 


Shift  control  counter  (SC).  This  6-bit  counter  is  used  primarily  to 
control  shift  operations. 

Accumulator temporary (U). The U register is used to store A
temporarily during an instruction or an operation which requires
the A's facilities.

OP  register  (OP).  This  5-bit  register  is  used  to  hold  the  operation 
code  portion  of  an  instruction. 

Index registers*. The three 16-bit registers are used in effective-address
calculations.




Overflow and carry indicators*. The two indicator bits associated
with the Accumulator are Overflow and Carry. The Overflow
indicator can be turned on by an Add, Subtract, or Divide instruction
and indicates a result larger than can be represented in the Accumulator.
The Overflow indicator can also be turned on by a Load-status
instruction. Once Overflow is on, it will not be changed
except by testing the indicator, or by a Load-status or Store-status
instruction. The Carry indicator provides the information that a
carry (or borrow) from the high-order position of the Accumulator
has occurred.

The Carry indicator is used with the Add, Subtract, Shift-left,
Load-status, Store-status, and Compare instructions.

In-bus. This 18-bit bus is a link (L) used to carry information from
a K to Pc. Generally only 16 of the 18 bits are used, although
transfers to magnetic tape can be made as three 6-bit characters.

Out-bus. This 18-bit bus is used to carry information from Pc to
a K.

Instruction-set  processor 

The operation of the Pc from a program viewpoint follows. The
ISP registers were declared (*) in the previous section and in Fig.
4. The ISP registers are the 16-bit I, A, Q, XR[1, 2, 3], and the
1-bit Overflow and Carry.

An ISP description of the 1800 appears in Appendix 1 of this
chapter. It is incomplete in the following respects: the memory
protect bit checking is not described; the illegal (undefined) instruction
action is not described; double word data must be aligned
on even and odd address word boundaries or else a fault occurs;
and the I/O instruction and interrupt operation are not given.

Instruction formats. Two basic instruction-word formats are used,
one word (Fig. 5) and two word (Fig. 6). The bits within the
instruction words are used in the following manner:


Fig. 5. IBM 1800 one-word-instruction format. (Courtesy of International
Business Machines Corporation.)

Fig. 6. IBM 1800 two-word-instruction format. (Courtesy of International
Business Machines Corporation.)

OP       Operation Code. These 5 bits define the instruction.


F        Format bit. A 0 indicates a single-word instruction,
         and 1 a two-word instruction.

T        Tag. These 2 bits specify which of the three index
         registers is used in address modification or the shift
         count.

DISP     Displacement. These 8 bits are usually added to
         the instruction register or the index register specified
         by T for one-word instructions. The modified
         address is defined as the Effective Address (EA).
         If T is 00, the displacement is added to the instruction
         register (then EA = I + DISP). The
         displacement is in two's complement form if negative,
         with the sign in bit 8. The bit in position
         8 is automatically extended to the higher-ordered
         bits (0 to 7) when the displacement is used in EA
         generation.

IA       Indirect addressing. This bit is used only in the
         two-word-instruction format. If 0, addressing will
         be direct. If a 1, addressing will be indirect. Only
         one level of indirect addressing is permitted. (The
         Load Index and Modify Index and Skip instructions
         have exceptions, as shown in the ISP description.)

BO       Branch Out. This bit is used to specify that the
         Branch or Skip on Condition (BSC) instruction is
         to be interpreted as a Branch Out (BOSC) when
         used in an interrupt routine.

COND     Conditions. These 6 bits select the indicators that
         are to be interrogated on a BSC or BSI instruction.
         The bit assignments for conditions are:

         Cond<10>  A = 0
         Cond<11>  A < 0
         Cond<12>  A > 0
         Cond<13>  (A<15> = 0) that is, A is even
         Cond<14>  (Carry = 0)
         Cond<15>  (Overflow = 0)

ADDRESS  These 16 bits usually specify a core storage address


408  Part  5  |  The  PMS  level 


Section  2     Computers  with  one  central  processor  and  multiple  Input/output  processors 


Table  1    Determining  effective  addresses 


F  =  0 

(F  =  1)  A  (lA  =  0) 

(F  =  J)a  (M  =  J) 

(direct  addressing}^ 

(direct  addressing) 

(indirect  addressing) 

T  =  00 

EA  ^  1  +  Dispt 

EA  ^  Address 

EA  ^  C(Address)§ 

T  =  01 

EA  ^  XR[1]  +  Disp 

EA  ^  Address  +  XR[1] 

EA  ~  C(Address  +  XR[1]) 

T  =  10 

EA  ^  XR[2]  +  Disp 

EA  ^  Address  +  XR[2] 

EA  ^  C(Address  +  XR[2]) 

T  =  11 

EA  ^  XR[3]  +  Disp 

EA  ^  Address  +  XR[3] 

EA  ^  C(Address  +  XR[3]) 

t  Contents  of  instruction  register  (I)  or  index  register  (XR[1],  XR[2],  XR[3]). 
J  May  be  true  positive  quantity  or  negative  two's  complement  quantity. 

§  C  specifies  "contents"  at  location  specified  by  Address  or  Address  +  XR[1].  XR[2],  or  XR[3]. 


in  a  two-word  instruction.  The  address  can  be 
modified  by  the  contents  of  an  index  register  or 
used  as  an  indirect  address  if  the  lA  bit  is  on. 
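The COND bit assignments listed above can be read as a set of independent tests on A, Carry, and Overflow. The following Python sketch reports which of the selected conditions currently hold. It is only an illustration: the function name and the string labels are hypothetical, and how BSC or BSI acts on the result (branch versus skip) is left to the ISP description and is not modeled here.

```python
def selected_conditions(cond, a, carry, overflow):
    """Return the names of the COND tests that are both selected and true.

    cond     -- 6-bit COND field; instruction bit 10 is its most significant bit
    a        -- 16-bit accumulator contents (two's complement)
    carry    -- Carry indicator, 0 or 1
    overflow -- Overflow indicator, 0 or 1
    """
    a &= 0xFFFF
    negative = bool(a & 0x8000)          # sign is bit 0, the high-order bit
    tests = [
        (10, "A = 0", a == 0),
        (11, "A < 0", negative),
        (12, "A > 0", a != 0 and not negative),
        (13, "A even", (a & 1) == 0),    # A<15> is the low-order bit
        (14, "Carry = 0", carry == 0),
        (15, "Overflow = 0", overflow == 0),
    ]
    # Instruction bit n of the word maps to bit (15 - n) of the integer value.
    return [name for bit, name, holds in tests
            if (cond >> (15 - bit)) & 1 and holds]
```

For instance, a COND field of 100100 selects the tests on bits 10 and 13, so with A = 0 both are reported as true.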

Effective-address generation. The Effective Address (EA) is
developed as shown in Table 1. The instruction set is divided into five
classes as shown in Table 2.
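The rules of Table 1 can be sketched as a small Python function. This is a minimal model, not the ISP description itself: the function and parameter names are illustrative, memory is represented as a plain mapping from address to word, and I is assumed to have been incremented already when it is used as a base, as the instruction-cycle steps later in the chapter indicate.

```python
MASK16 = 0xFFFF


def sign_extend_disp(disp):
    """Extend the 8-bit displacement (sign in instruction bit 8) to 16 bits."""
    disp &= 0xFF
    return disp | 0xFF00 if disp & 0x80 else disp


def effective_address(f, t, ia, i, xr, memory, disp=0, address=0):
    """Return EA according to Table 1.

    f, ia   -- format bit and indirect-address bit (0 or 1)
    t       -- 2-bit tag; 0 selects I (one-word) or no indexing (two-word)
    i       -- instruction register, assumed already incremented
    xr      -- mapping {1: .., 2: .., 3: ..} of index-register contents
    memory  -- mapping from address to 16-bit word (hypothetical model of Mp)
    """
    if f == 0:
        # One-word format: base register plus sign-extended displacement.
        base = i if t == 0 else xr[t]
        return (base + sign_extend_disp(disp)) & MASK16
    # Two-word format: Address, optionally indexed, optionally indirect.
    target = address if t == 0 else (address + xr[t]) & MASK16
    return (memory[target] & MASK16) if ia else target
```

All arithmetic is taken modulo 2^16, which is how a negative displacement moves the address backward.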

Storage protection. The storage-protection facility protects the
contents of specified individual locations of Mp from change due
to the erroneous storing of information during the execution of
a program. The status of each location is identified as "read only"
or "read/write" by the condition of the Storage Protect Bit, S.

The Store-status instruction is used to write and clear Storage
Protect Bits. The execution of this instruction is under control of
the Write Storage Protect Bits switch on the console. Any attempt
by the program to write into a read-only protected location results
in a storage-protect violation which causes the Internal Interrupt
(the highest priority interrupt).

Instruction interpretation process

The simplified Pc data-flow block diagram (Fig. 4) shows
instructions and data entering and leaving memory via the B register.
Additional bits in Pc hold the P and S bits for Mp. Input devices
send data and instructions to the B register via the 18-bit In-bus.
Output devices receive data from the B register via the 18-bit
Out-bus. Eighteen bits can be transferred between Pc and
K(magnetic tape). As each stored-program instruction is selected, its
various parts (op code, format bit, etc.) are directed to the control
registers via the B register and the Out-bus. The control registers
decode and interpret each instruction before the instruction is
executed.

Table 2  Instruction set

Class            Instruction                          Indirect     Mnemonic
                                                      addressing

Load and store   Load accumulator                     Yes          LD
                 Double load                          Yes          LDD
                 Store accumulator                    Yes          STO
                 Double store                         Yes          STD
                 Load index                           ‡            LDX
                 Store index                          Yes          STX
                 Load status                          No           LDS
                 Store status                         Yes          STS

Arithmetic       Add                                  Yes          A
                 Double add                           Yes          AD
                 Subtract                             Yes          S
                 Double subtract                      Yes          SD
                 Multiply                             Yes          M
                 Divide                               Yes          D
                 And                                  Yes          AND
                 Or                                   Yes          OR
                 Exclusive Or                         Yes          EOR

Shift            Shift Left instructions:
                   Shift left logical (A)†            No           SLA
                   Shift left logical (AQ)†           No           SLT
                   Shift left and count (AQ)†         No           SLC
                   Shift left and count (A)†          No           SLCA
                 Shift Right instructions:
                   Shift right logical (A)†           No           SRA
                   Shift right arithmetically (AQ)†   No           SRT
                   Rotate right (AQ)†                 No           RTE

Branch           Branch and store I                   Yes          BSI
                 Branch or skip on condition          Yes          BSC (BOSC)
                 Modify index and skip                ‡            MDX
                 Wait                                 No           WAIT
                 Compare                              Yes          CMP
                 Double compare                       Yes          DCM

I/O              Execute I/O                          Yes          XIO

† Letters in parentheses indicate registers involved in shift operations.
‡ See the section for the individual instruction (MDX and LDX).


Chapter 33 | The IBM 1800  409


Except for Pio operations, all instructions and data in memory
are addressed by the Storage Address Register (SAR). SAR obtains
the memory address from the I register or the A register. The
contents of the I register are developed by one of the following
means, depending on the Pc operation:

1  The I register is incremented for each instruction.

2  The effective address of each instruction is developed in
   the accumulator (A register) and then transferred to SAR.
   The contents of the accumulator are saved in an auxiliary
   (U) register during effective-address computation. If the
   instruction was a branch, the contents of SAR are transferred
   to the I register.

The following examples illustrate the data flow or instruction
interpretation process for the Load Accumulator (LD) instruction.

One-word load instruction

Instruction Cycle

1   A register transfers to U register.
2   I register transfers to SAR (I register is then incremented).
3   SAR addresses the memory location containing the instruction.
4   Memory location transfers to the B register and Out-bus.
5   Control registers store various parts of the instruction (op
    code, format, and tag).
6   Displacement is stored in the D register.
7   a  If tag = 00, I register transfers to A register.
    b  If tag ≠ 00, the specified XR transfers to A register.
8   Displacement (D register) is added to A register.

Execute Cycle

9   A register transfers to SAR (effective address).
10  U register transfers to A register.
11  SAR addresses data word.
12  Data word transfers to B register.
13  B register loads into A register (via D register).

Two-word load instruction, direct addressing

Instruction Cycle 1

1   A register transfers to U register.
2   I register transfers to SAR (I register is then incremented).
3   SAR addresses the memory location containing the instruction
    (first word).
4   Memory location transfers to B register and Out-bus.
5   Control registers store various parts of the instruction (op
    code, format, and tag).
6   If tag ≠ 00, the specified XR transfers to A register.

Instruction Cycle 2

7   I register transfers to SAR (I register is then incremented).
8   SAR addresses second word of instruction.
9   Second word of instruction (address) is read into B register.
10  Address (from B register) is stored in D register.
11  a  If tag = 00, D register transfers to A register.
    b  If tag ≠ 00, D register is added to A register (A register
       contains contents of XR).

Execute Cycle

12  A register transfers to SAR (effective address).
13  U register transfers to A register.
14  SAR addresses memory at effective address (data word).
15  Data word transfers to B register.
16  B register loads into A register (through D register).
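The two step lists above can be traced with a short register-transfer model in Python. This is a sketch under stated assumptions rather than a description of the hardware: registers are entries in a dictionary, memory is a mapping from address to word, the op code is not decoded, and the field positions (F in bit 5, T in bits 6 and 7, DISP in bits 8 to 15, with bit 0 the high-order bit) are taken from the field widths given in the instruction-format discussion.

```python
MASK16 = 0xFFFF


def sign_extend(disp8):
    """Extend an 8-bit displacement to 16 bits (sign in instruction bit 8)."""
    return disp8 | 0xFF00 if disp8 & 0x80 else disp8


def load_accumulator(regs, xr, memory):
    """Trace LD (direct addressing only) through A, U, I, SAR, B and D."""
    regs["U"] = regs["A"]                       # save the accumulator
    regs["SAR"] = regs["I"]                     # address the first word
    regs["I"] = (regs["I"] + 1) & MASK16
    regs["B"] = memory[regs["SAR"]]
    f = (regs["B"] >> 10) & 1                   # format bit, instruction bit 5
    t = (regs["B"] >> 8) & 3                    # tag, instruction bits 6-7
    if f == 0:
        regs["D"] = sign_extend(regs["B"] & 0xFF)
        regs["A"] = regs["I"] if t == 0 else xr[t]
        regs["A"] = (regs["A"] + regs["D"]) & MASK16
    else:
        if t != 0:
            regs["A"] = xr[t]
        regs["SAR"] = regs["I"]                 # address the second word
        regs["I"] = (regs["I"] + 1) & MASK16
        regs["B"] = memory[regs["SAR"]]
        regs["D"] = regs["B"]
        if t == 0:
            regs["A"] = regs["D"]
        else:
            regs["A"] = (regs["A"] + regs["D"]) & MASK16
    # Execute cycle: A holds the effective address at this point.
    regs["SAR"] = regs["A"]
    regs["A"] = regs["U"]                       # restore the accumulator
    regs["B"] = memory[regs["SAR"]]
    regs["D"] = regs["B"]
    regs["A"] = regs["D"]                       # loaded via the D register
    return regs
```

The model makes the role of the U register visible: A is borrowed as the address adder during the instruction cycle and restored before the data word replaces it.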

Central-processor communication with the controls¹

Direct program control of the controls

Pc direct programmed control of I/O devices is on the basis of
single-word or character-at-a-time transfers for each XIO
instruction executed. One data word or character is transferred to or from
Mp to K. The XIO instruction specifies an I/O Control Command
(IOCC) with a function of Control, Sense, Read, or Write to a
controlled device. This command is either directly to a device or
to a Pio.

It is possible for the program sequence to execute an XIO
instruction to a device that is busy responding to a previous XIO
instruction. Each device has a Busy indicator, which signals
whether or not the device can accept data or control information.
(Incorrect program sequence timing may cause undetected errors.)

¹IBM name: Adapter or Device Adapter.




It is possible for a device operating synchronously with the
program to request a data word transfer before the program
sequence is ready to service the request. Devices with this
potential have a "program check" indicator to signal when data have
been lost (that is, Pc has not kept up with the device).

Execute I/O instruction (XIO)

This instruction is used for programmed I/O operations and to
initialize Pio; it may be either one or two words in length, as
specified by the F bit. In the two-word instruction the address
is either a direct or indirect address, as specified by the IA bit.
For proper operation the effective address must be an even
address. The effective address is used to select a two-word I/O
Control Command (IOCC) from storage.

The IOCC specifies the I/O operation, I/O device, and core
storage address. The format of the two-word IOCC follows, with
an explanation of the assigned fields:

Area := IOCC[1]<0:4>. The area field specifies a unique segment
of I/O which may be a single device (1442 Card Read-Punch, 1443
Printer, etc.) or a group of several units (magnetic-tape drives,
serial I/O units, contact sense units, etc.). (Area 00000 is used to
address system devices such as the console and the Interrupt Mask
Register.)

Function := IOCC[1]<5:7>. The primary I/O functions are
specified by the 3-bit function code of the IOCC:

000   Removes an I/O device from on-line status and places
      it in a "free" mode.

001   Write
      Transfers a single word from storage to an I/O unit.
      The address of the storage location is provided by the
      Address field of the I/O Control Command.

010   Read
      Transfers a single word from an I/O unit to storage.
      The address of the storage location is provided by the
      Address field of the I/O Control Command.

011   Sense Interrupt Level
      Directs the selected I/O device to make its status
      available in the Accumulator as the Interrupt Level
      Status Word (ILSW).

100   Control
      Causes the selected device to interpret the Address
      and/or Modifier of the IOCC as a specific control
      action. Examples are feed card and load interrupt mask
      register.

101   Initialize Write
      Initiates a Write operation on a device or unit which
      will subsequently make data transfers from storage via
      a Pc.

110   Initialize Read
      Initiates a Read operation from a device or unit which
      will subsequently make data transfers to storage via a
      Data Channel.

111   Sense Device
      Reads the selected device status word into the
      Accumulator. A Device Status Word (DSW) and the Process
      Interrupt Status Word (PISW) are sensed with this
      instruction.

      If Area 00000 is specified, the Console status and
      Interval Timer status may be brought into the
      Accumulator as specified by a unit address code in the
      Modifier field.

The current contents of the Accumulator are destroyed by the
execution of Sense Interrupt Level, Sense Device, Initialize Read,
Initialize Write, Read, or Write.

Modifier := IOCC[1]<8:15>. This 8-bit field provides additional
detail for either Function or Area. For example, if the Area
specifies a disk and if the Function specifies Control (100) then a
particular modifier code specifies the direction of the Seek
operation. In this case, the Modifier serves to extend the function.

If, however, the Area specifies a group of I/O devices, and if
the Function specifies Write (001), then the particular unit address
is specified by the modifier.

Address := IOCC[0]<0:15>. The meaning prescribed for this 16-bit
field is dependent upon the Function specified by this I/O Control
Command:

1  If Function is Initialize Write (101) or Initialize Read (110),
   then Address specifies the starting address of a table in
   storage (an I/O block). The contents of this table are data
   words and control information.

2  If Function is Control (100) and if, for example, Area
   specifies the 1443 Printer, the Address may specify a specific
   control action.

3  If Function is Sense (011 or 111), the Address field is ignored.
   Instead, an increment of time equivalent to a memory cycle
   is taken, during which the selected I/O device or
   Interrupt Level places its status word in the accumulator.




4  If Function is Write (001) or Read (010), the Address
   specifies the storage location of the data word.
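The IOCC field layout described above (Address in the first word; Area, Function, and Modifier in the second) can be unpacked with a few shifts and masks. The Python sketch below is illustrative only: the `IOCC` tuple and `decode_iocc` name are hypothetical, the example word is made up, and function code 000 is left out of the name table because the text gives it a description but no name.

```python
from collections import namedtuple

IOCC = namedtuple("IOCC", "address area function modifier")

# Function names as listed in the text; code 000 has no name there.
FUNCTIONS = {
    0b001: "Write",
    0b010: "Read",
    0b011: "Sense Interrupt Level",
    0b100: "Control",
    0b101: "Initialize Write",
    0b110: "Initialize Read",
    0b111: "Sense Device",
}


def decode_iocc(word0, word1):
    """Split the two IOCC words into fields (bit 0 is the high-order bit)."""
    return IOCC(
        address=word0 & 0xFFFF,          # IOCC[0]<0:15>
        area=(word1 >> 11) & 0x1F,       # IOCC[1]<0:4>
        function=(word1 >> 8) & 0x7,     # IOCC[1]<5:7>
        modifier=word1 & 0xFF,           # IOCC[1]<8:15>
    )


# Made-up example: area 00011, function 110, modifier 00101010.
example = decode_iocc(1000, 0b00011_110_00101010)
```

Here `example.function` is 6, which `FUNCTIONS` maps to "Initialize Read", so `example.address` would be the starting address of a data table.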

XIO execution interpretation process

1  The EA of the XIO is developed in the accumulator (A)
   and routed to the Storage Address Register (SAR) to locate
   the IOCC (as for any EA).

2  Bit position 15 of SAR is forced on to select the EA + 1
   where the IOCC Area, Function, and Modifier are found.

3  The Area, Function, and Modifier are routed through the
   B register to the Out-bus to the control of the device
   specified by the Area.

4  Bit position 15 of SAR is turned off to allow the address
   portion of the IOCC word to be transferred from the Mp
   location specified by the Effective Address (EA) to the B
   register.

5  If the Function is an Initialize Read, Initialize Write, or
   Control, the address part of the IOCC is routed through
   the B register to the Out-bus. The address part of the
   Initialize Read/Write IOCC goes to the Channel Address
   Register (CAR) of Pio. If the Function is Read or Write, the
   address is routed from the B register through the A
   register to the SAR. SAR addresses the memory location to or
   from which the data are transmitted.
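Steps 2 and 4 explain why the effective address of an XIO must be even: the two IOCC words are selected by switching bit 15 of SAR rather than by adding 1. A minimal Python sketch of that addressing trick follows; the function name and the dictionary model of Mp are assumptions made for illustration.

```python
def fetch_iocc(memory, ea):
    """Return (control_word, address_word) in the order the steps read them.

    memory -- mapping from address to 16-bit word (hypothetical model of Mp)
    ea     -- effective address of the XIO instruction
    """
    sar = ea | 0x0001          # bit 15 forced on: this is EA + 1 only if EA is even
    control_word = memory[sar]  # Area, Function, Modifier
    sar &= 0xFFFE              # bit 15 turned off: back to the address word
    address_word = memory[sar]
    return control_word, address_word
```

In this model an odd EA does not fail; it reads the word pair that starts one location lower, which is one way to see why the text requires an even address for proper operation.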

Interval timers

Three timers are provided to supply real-time information to the
program. They are in core-storage locations 0004 (Timer A), 0005
(Timer B), and 0006 (Timer C). Each timer is incremented
according to its associated or permanent time base, which can be
hardwired to be 0.125, 0.250, 0.5, 1, 2, 4, 8, 16, 32, 64, or 128
milliseconds.

The timers can be started or stopped under program control.
When the count reaches zero, an interrupt is requested on the
level assigned to the timers.
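Because a timer counts upward and requests an interrupt on reaching zero, a program that wants an interrupt after a given interval would store a negative count in the timer location. The Python sketch below computes such a value. It rests on assumptions not stated in the text: that the full 16-bit word takes part in the count and that it wraps from all ones to zero. The function name is hypothetical.

```python
TIME_BASES_MS = (0.125, 0.25, 0.5, 1, 2, 4, 8, 16, 32, 64, 128)


def timer_preload(interval_ms, time_base_ms):
    """Return the 16-bit value to store so the timer reaches zero after interval_ms.

    Assumes a 16-bit up-counter that wraps to zero (an assumption, see above).
    """
    if time_base_ms not in TIME_BASES_MS:
        raise ValueError("time base is not one of the hardwired values")
    ticks = round(interval_ms / time_base_ms)
    if not 1 <= ticks <= 0xFFFF:
        raise ValueError("interval does not fit in one 16-bit timer word")
    return (-ticks) & 0xFFFF   # two's complement of the tick count
```

With a 1-millisecond time base, a 100-millisecond interval corresponds to 100 ticks, so the stored value is the two's complement of 100.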

Interrupt

The interrupt feature provides an automatic branch from the
normal program sequence, based upon an external condition. A
maximum of 24 external interrupt levels (groups) are available,
arranged in order of priority. Twelve external interrupt levels are
standard. Each interrupt level has a unique core-storage address
assigned to it. Several devices may be connected to a single
interrupt level, and program polling can be used to differentiate the
possible signals causing the interrupt. The Interrupt Level Status
Word, ILSW, is used to identify the specific condition causing its
interrupt level to request service.

Internal interrupt. When any one of the following error conditions
occurs, there is an internal interrupt in Pc: an invalid op code;
a Mp parity error (an even number of bits); a storage-protect
violation; and Channel Address Register check error. The internal
interrupt takes priority over all external interrupts and cannot be
masked.

A mask register exists for the masking and unmasking of
interrupt levels. An interrupt level that is masked cannot initiate a
request for service until it has been unmasked.
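The priority and masking rules above amount to a simple selection procedure, sketched here in Python. The representation is hypothetical: pending and masked levels are given as sets of level numbers, and a lower level number is assumed to mean higher priority, which the text does not state explicitly.

```python
def next_interrupt(internal_pending, pending_levels, masked_levels):
    """Pick the interrupt to service next, or None if nothing can be serviced.

    internal_pending -- True if an internal (error) interrupt is waiting
    pending_levels   -- set of external level numbers requesting service
    masked_levels    -- set of external level numbers currently masked
    Assumption: a lower level number means a higher priority.
    """
    if internal_pending:
        return "internal"              # outranks all levels, cannot be masked
    for level in sorted(pending_levels):
        if level not in masked_levels:
            return level
    return None
```

A masked level stays pending in this model and is picked up by a later call once it has been removed from `masked_levels`.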

Device status word (DSW). DSW indicators usually fall into three
general categories:

1  Error or exception interrupt conditions

2  Normal data or service-required interrupts

3  Routine status conditions

Process interrupt status word indicators (PISW). The PISW
indicators are physically located in Pc and are turned on by events
external to the computer, e.g., contact closures or voltage shifts.

I/O processors¹

The Pc initializes each Pio with an XIO instruction. The Pio has
priority to the extent that, when the I/O device is ready to send
or receive a data word, the Pc is stopped while the word transfers
to or from core storage. Pc data and conditions are undisturbed
except for the memory locations that receive data from an input
device.

I/O devices that are to be operated concurrently must be on
separate Pio's.

The XIO instruction for a Pio specifies an I/O Control
Command (IOCC) with a function of Initialize Read or Initialize Write.
However, even though a device operates with a Pio, the XIO
instructions in Pc are used to sense device status and for control.

¹IBM name: Data Channel (DC).

Registers

Channel address register. The Channel Address Register (CAR)
is a 16-bit register used to store the Mp address of the next word
that will be addressed by the Pio. Each Pio has a CAR. Pio and
its associated CAR are selected when their assigned I/O device
is selected by the Area Code and Modifier of an IOCC word.
CAR is incremented by 1 after each transfer of its contents
to CAB.




Channel address buffer. A common Channel Address Buffer (CAB)
is used by all Channel Address Registers to address Mp. When a
cycle steal request occurs, the CAR for the requesting Pio is
transferred into the Channel Address Buffer.

Channel-address-register check bit. Channel Address Register
(CAR) checking is provided to ensure that the first word addressed
by a selected CAR is the first word of the correct data table. Thus
the check determines if a Pc program has set up the Pio program
correctly.¹ A CAR check is made for all devices after the address
from the IOCC word is transferred to the selected CAR. A
bit-by-bit comparison is made between the contents of the selected
CAR and the contents of the B register. If any of the corresponding
bits are not equal, a CAR check error has occurred. This CAR
check error terminates the Pio task and initiates an internal
interrupt.
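The bit-by-bit comparison in the CAR check is equivalent to an exclusive-or of the two registers: any bit left set marks a disagreement. A minimal Python sketch, with a hypothetical function name, is:

```python
def car_check(car, b_register):
    """Return (ok, mismatch_bits) for a CAR check.

    mismatch_bits has a 1 in every position where CAR and B differ,
    so the check passes only when it is zero.
    """
    mismatch_bits = (car ^ b_register) & 0xFFFF
    return mismatch_bits == 0, mismatch_bits
```

In the machine a failed check ends the Pio task and raises the internal interrupt; the sketch only reports the result.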

Word count register. A Word Count Register is provided in each
Pio. The Word Count Register is loaded with the contents of the
word-count portion of the data table, <2:15>. This register is
decremented each time a data word is transferred from (to) the
data table.

Scan control register. A Scan Control Register is provided in each
Pio that has chaining ability. Scan Control register bits are stored
in the first word of the first data table (bit positions 0 and 1) and
in the second word (bit positions 0 and 1) of the second data
table and all subsequent data tables in a chain.

The Scan Control Register controls the I/O device and the Pio
operation at the end of the data table as follows: single scan of
data table and stop with an interrupt; single scan of data table
and stop (no interrupt); continuous scan of this data table or a
different data table with an interrupt at the end of this table; and
continuous scan of this data table or a different data table with
no interrupt.
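The control word of a data table packs the Scan Control bits and the word count into one 16-bit word, and it can be split as in the Python sketch below. The function name is hypothetical. The two test words are the bit patterns shown in Fig. 7, whose annotations give 11 as continuous scan with no interrupt and 00 as single scan and stop with an interrupt; the encodings of the other two modes are not given in this section, so they are not named here.

```python
def decode_table_control(word):
    """Split a data-table control word into (scan_control, word_count).

    Scan Control occupies bits 0-1 (the two high-order bits) and the
    word count occupies bits 2-15.
    """
    scan_control = (word >> 14) & 0b11
    word_count = word & 0x3FFF
    return scan_control, word_count


# Control words of the two tables in Fig. 7.
first_table = decode_table_control(0b11_00000000010110)
second_table = decode_table_control(0b00_00000000110110)
```

These decode to a word count of 22 for the first table and 54 for the second, matching the figure's annotations.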

¹Not a completely arbitrary program fault to check, since processors are
involved.

The I/O processor program operation

The sequence of steps for a Pio program is given below. The
memory map or format of the program is shown in Fig. 7.

1  Pc issues an XIO instruction which references the IOCC
   word and initializes Pio.

2  The Area Code and Modifier of the IOCC select the I/O
   device. Function specifies the type of operation (Initialize
   Read or Initialize Write, etc.).

3  a  The address portion of the IOCC word is stored in CAR
      for the selected Data Channel and I/O device.
   b  A CAR check is made between the selected CAR and
      the B register.

4  A cycle steal is requested by Pio; CAR transfers to CAB.

5  CAB addresses core storage for the first word of the data
   table while CAR is being incremented by 1.

6  The first word of the data table contains
   a  Scan Control bits (bit positions 0 and 1)
   b  Word Count (bit positions 2 to 15)
   These are transferred to their respective registers in the I/O
   device. This is the end of the first cycle steal.

7  When another cycle-steal request from Pio occurs, CAR,
   which was incremented in step 5, now transfers the next
   higher address to CAB. CAB then addresses core storage
   while CAR is being incremented.

8  The first data word is transferred to or from the I/O device
   via the B register and Data Channel. The Word Count
   Register in the I/O device is decremented by 1. This is the
   end of the second cycle-steal cycle.

Steps 7 and 8 now continue on a cycle-steal basis; that is, they
occur as the I/O device requests data transfers. The CAR is
incremented with each data transfer and the WCR is decremented.
This sequence continues until the last data word of the data table
is transferred. The last word transfer is sensed by the WCR
reaching zero or through some indicator in the device. If the device
does not have chaining ability, no more demands for data transfer
are made until the device is reinitialized with another XIO
instruction.
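The cycle-steal sequence in steps 4 through 8 can be sketched in Python for an Initialize Read on a device without chaining. This is a simplified model under stated assumptions: memory is a dictionary standing in for Mp, the device is just an iterable of input words, each loop pass stands for one cycle steal, and the function name is hypothetical.

```python
def pio_read_block(memory, table_address, device_words):
    """Store device words into the data table that starts at table_address.

    Returns (scan_control, car) where car is the address following the
    last word stored.
    """
    car = table_address                 # loaded from the IOCC address (step 3)
    cab = car                           # first cycle steal (steps 4-6)
    car += 1
    control = memory[cab]
    scan_control = (control >> 14) & 0b11
    wcr = control & 0x3FFF              # Word Count Register
    words = iter(device_words)
    while wcr > 0:                      # one pass per cycle steal (steps 7-8)
        cab = car
        car += 1
        memory[cab] = next(words) & 0xFFFF
        wcr -= 1
    return scan_control, car
```

The word count covers data words only, which agrees with Fig. 7, where a count of 22 spans locations 1001 through 1022.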

Chaining. These steps are for the second and all subsequent data
tables. See above for steps 1 through 8.

9   The contents of the word following the last data word in
    the first data table are transferred to CAR. This word must
    contain the address of the next data table.

10  a  When the next cycle is requested, CAR is transferred
       to CAB to address core storage. The contents of the
       first word of the next data table are transferred to the
       B register. This word must contain the address of itself.
    b  CAR check is performed and CAR is incremented
       by 1.

11  When the next cycle steal is requested, CAR is transferred
    to CAB and CAB addresses Mp. The Scan-control bits and
    Word-count bits are transferred from the second word of
    the data table to their respective registers. CAR is
    incremented by 1.

12  Data are transferred to (from) the I/O device on a
    cycle-steal basis via the B register and the Data Channel. CAB
    addresses core storage to transfer a data word to the B
    register. Each time CAB addresses core storage, CAR is
    incremented by 1. When the next cycle-steal request
    occurs, CAR is transferred to CAB. The Word-count
    Register is decremented for each word transferred.

13  When the last data character is transferred (word count
    is decremented to zero), operation will continue as
    specified by the Scan Control Register. (See above section for
    Scan-Control Register.)

[Fig. 7 memory maps, reconstructed from the figure labels]

(a) First data table

    Location   Contents
    1000       Scan Control = 11 (continuous with no interrupt), Word Count = 22
    1001       First data word
    ...
    1022       Last data word
    1023       Address of next table (2000)
    1024       IOCC Address (1000)
    1025       IOCC Area, Function, and Modifier

    The XIO instruction has Address = 1024.

(b) Second data table

    Location   Contents
    2000       CAR check word (2000)
    2001       Scan Control = 00 (single scan and stop with an interrupt), Word Count = 54
    2002       First data word
    ...
    2055       Last data word

Fig. 7. IBM 1800 data-channel tables for chaining memory maps. (a) First data table; (b) second data table. (Courtesy of International Business Machines Corporation.)
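The chaining steps can be tried out with a small Python model of an Initialize Write that follows the table layout of Fig. 7. It is a sketch only: the function name and the dictionary model of Mp are hypothetical, cycle-steal timing is ignored, and only the two Scan Control codes annotated in Fig. 7 are handled, because this section does not give the encodings of the other two modes.

```python
SC_SINGLE_STOP_INTERRUPT = 0b00     # per the Fig. 7 annotation
SC_CONTINUOUS_NO_INTERRUPT = 0b11   # per the Fig. 7 annotation


def pio_write_chain(memory, start):
    """Return (words_sent, interrupt_requested) for a chain starting at start."""
    sent = []
    car = start
    first_table = True
    while True:
        if not first_table:
            # Step 10: the first word of a chained table must hold its own address.
            if memory[car] != car:
                raise RuntimeError("CAR check error")
            car += 1
        control = memory[car]               # steps 5-6 or step 11
        car += 1
        scan_control = (control >> 14) & 0b11
        wcr = control & 0x3FFF
        while wcr > 0:                      # steps 7-8 or step 12
            sent.append(memory[car])
            car += 1
            wcr -= 1
        if scan_control == SC_SINGLE_STOP_INTERRUPT:
            return sent, True               # step 13: stop with an interrupt
        if scan_control != SC_CONTINUOUS_NO_INTERRUPT:
            raise NotImplementedError("scan-control encoding not given in the text")
        car = memory[car]                   # step 9: address of the next table
        first_table = False
```

A chain whose last table points back to an earlier one with continuous scan would loop indefinitely in this model, as the continuous-scan modes imply.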


Special data channels

The four Pio types for special functions are:

1  Analog input (block data transfers, and comparisons of
   analog inputs for limits)

2  Digital input

3  Analog output

4  Digital output

[Fig. 8: memory-map diagrams not reproduced. In the multiplexor address words, L = 1 indicates that a limit word follows and K = 1 that the comparison is performed. Chained tables begin with a word that contains its own address.]

Fig. 8. IBM 1800 data-channel analog-input instruction format and memory maps. (a) Multiplexor address table with limit words for comparisons. (b) Data table, chained sequential control. (c) Multiplexor address table, random addressing. (d) Analog-to-digital converter storage tables, random addressing (used with a second data channel).

[Fig. 9: memory-map diagrams not reproduced. Each table begins with a Scan Control and Word Count word; the sequential forms follow it with one initial address and then the data words, and the random-addressing forms alternate address words and data words.]

Fig. 9. IBM 1800 data channel digital or analog-output instruction formats and memory maps. (a) Digital input, sequential; (b) digital input, random addressing; (c) digital or analog output, sequential; (d) digital or analog output, random addressing. (Courtesy of International Business Machines Corporation.)

Analog-input data channels. Memory maps (Fig. 8a and b)
illustrate the command formats interpreted in the Analog Data
Channel programs. A list of limit values is placed in a table (Fig. 8a),
and each analog input is compared with the limits. The operation
sequence is: Read a specific addressed analog voltage, called the
multiplex¹ point (mpx); compare the input voltage with the limits
stored in the table following the analog address (the limit word
contains a high and low value in bits <0:7> and <8:15>,
respectively); and if the analog-input value lies outside the limit range,
initiate an interrupt.

¹The IBM multiplexor is an S which allows multiple inputs to be read
into the T(Analog to Digital Converter) sequentially.
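The limit comparison described for the analog-input data channel can be illustrated with a few lines of Python. The sketch unpacks the limit word as the text describes (high limit in bits 0 to 7, low limit in bits 8 to 15) and tests a value against the range. It assumes the converted value has already been reduced to the same unsigned 8-bit scale as the limits; how the hardware aligns the converter output with the limits is not covered here, and the function names are hypothetical.

```python
def unpack_limit_word(word):
    """Return (high, low) from a limit word: high in bits 0-7, low in bits 8-15."""
    high = (word >> 8) & 0xFF
    low = word & 0xFF
    return high, low


def out_of_limits(value, limit_word):
    """True if value lies outside [low, high], the case that raises an interrupt.

    Assumes value is an unsigned quantity on the same scale as the limits.
    """
    high, low = unpack_limit_word(limit_word)
    return not (low <= value <= high)
```

With a limit word of 0x6040 the acceptable range is 0x40 through 0x60, so a reading of 0x70 would be flagged.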

Figure 8b describes a second use of this data channel. Pio
accepts a sequence of analog inputs and packs them into a table
following the address initiation instruction. The analog inputs from
the T's are either fixed or selected in a cyclic fashion from a
Multiplexor.

Two Pio's can be used concurrently: One Pio controls the input
from a series of analog-input addresses (Fig. 8c); the second Pio
packs the corresponding analog values in a second table (Fig. 8d).


Digital-input data channels. Digital parameters or events can be read into Mp under the control of a Digital-input Data Channel. The memory map (Fig. 9a) shows the control format for selecting and inputting a block or sequence of external data. The memory map (Fig. 9b) illustrates a more general ability to address inputs at random and read them into succeeding Mp locations.
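The random-addressing layout of Fig. 9b can be sketched as follows. This is only an illustration, assuming the table holds a word count of 2n followed by n address/data pairs, as the figure labels suggest; the function name `random_input_table`, the `read_group` callback, and the dictionary layout are hypothetical and are not part of the 1800's definition.

```python
def random_input_table(group_addresses, read_group):
    """Build an illustrative image of the Fig. 9b table.

    group_addresses: digital-input group addresses chosen by the program.
    read_group: hypothetical callback returning the 16-bit value of a group.
    """
    body = []
    for addr in group_addresses:
        body.append(addr)                       # Digital Input Group Address k
        body.append(read_group(addr) & 0xFFFF)  # Data k, stored in the next word
    # Each point occupies two words, so the word count is 2n.
    return {"word_count": len(body), "body": body}
```

In the sequential case (Fig. 9a) only the initial group address would be stored, and the data words would follow one another directly.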

Digital- and analog-output data channels. Memory maps (Fig. 9c and d) show the program format used by the Digital- or Analog-output Data Channels. These channels output selected data points to external analog or digital K's. This Pio is similar to the Digital-input Data Channel.

Conclusions 

We have tried to show a typical, third-generation computer used for process control. Many of the facilities the 1800 possesses are general. The Pio's are rather special, designed to monitor and control a process, independent of Pc. Although the Pio's are powerful (by providing parallel data transmission), their use, like other multiprocessing systems, is nontrivial. The Pc ISP is fairly straightforward, and one should write a program using it to appreciate its simplicity.




APPENDIX  1    THE  IBM  1800  ISP  DESCRIPTION 


Pc State

A<0:15>                      Accumulator
Q<0:15>                      Accumulator Extension for multiplier, quotient and double length
I<0:15>                      Instruction Location Counter
XR[1:3]<0:15>                Index Registers
Ov                           Overflow Indicator
C                            Carry Indicator
Run                          denotes running computer

Mp State

M[0:FFFF₁₆]<P,S,0:15>        Mp with Parity and Protect bits

Pc Console State

Check Stop Switch            Pc stops if storage protect violation occurs
WSPB Switch                  Write Storage Protect Bits; enables the writing of bits in a word
SPV Indicator                Storage Protect Violation Indicator: set to 1 if a memory ...

Instruction Format

instruction/i[0:1]<0:15>
op<0:4>      := i[0]<0:4>                operation code
shop<0:7>    := op□i[0]<5,8,9>           shift operation code
f            := i[0]<5>                  format; specifies a 1 or 2 word instruction
t<0:1>       := i[0]<6:7>                tag: index register specification
d<8:15>      := i[0]<8:15>
dsgn<0:15>   := sign_extend(d<8:15>)
a<0:15>      := i[1]<0:15>               address
ia           := i[0]<8>
bo           := i[0]<9>                  branch out bit
cond<0:5>    := i[0]<10:15>              conditions for test

Effective Address Calculation Process

z<0:15> := (                                         effective address
    (t = 0) ∧ ¬f → (dsgn + I);                       1 word, relative
    (t ≠ 0) ∧ ¬f → (dsgn + XR[t]);                   1 word, relative, indexed
    (t = 0) ∧ f ∧ ¬ia → a;                           2 word, direct
    (t ≠ 0) ∧ f ∧ ¬ia → (a + XR[t]);                 2 word, direct, indexed
    (t = 0) ∧ f ∧ ia → M[a];                         2 word, indirect
    (t ≠ 0) ∧ f ∧ ia → (M[a + XR[t]]))               2 word, indirect, indexed

z'<0:15> := (                                        effective address for index register instructions
    ¬f → (dsgn + I);
    f ∧ ¬ia → a;
    f ∧ ia → M[a])
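The six cases of the effective-address process z can be restated as a short Python sketch. It is an illustration under stated assumptions, not IBM's definition: the function names are hypothetical, `XR` is taken to be a list indexed 1 to 3, `M` a mapping from address to word, and all arithmetic is assumed to wrap at 16 bits.

```python
MASK = 0xFFFF  # 16-bit words

def sign_extend8(d):
    """Interpret the 8-bit displacement field d<8:15> as a signed value (dsgn)."""
    d &= 0xFF
    return d - 0x100 if d & 0x80 else d

def effective_address(f, t, ia, d, a, I, XR, M):
    """Return z for format bit f, tag t, indirect bit ia."""
    if not f:
        # One-word instruction: displacement relative to I, or to XR[t] when tagged.
        base = I if t == 0 else XR[t]
        return (sign_extend8(d) + base) & MASK
    # Two-word instruction: address word a, optionally indexed.
    addr = a if t == 0 else (a + XR[t]) & MASK
    # The indirect bit selects the content of that location instead.
    return M[addr] & MASK if ia else addr
```

The sketch takes I as given; in the interpretation process I has already been stepped past the instruction when execution begins.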



zd<0:15> := (¬z<15> → z + 1;                         process for locating second operand for double length
             z<15> → z)

xi<0:15> := (¬f → dsgn;                              index increment
             f ∧ ¬ia → a;
             f ∧ ia → M[a])

s<0:5> := (                                          shift count calculation
    (t = 0) → d<10:15>;
    (t ≠ 0) → XR[t]<10:15>)

Instruction Interpretation Process

Run → (instruction[0:1] ← M[I:I + 1]; next           fetch
       ¬f → (I ← I + 1); f → (I ← I + 2); next       1 or 2 word instruction
       Instruction_execution)                        execute

Instruction Set and Instruction Execution Process

Instruction_execution := (

Load and Arithmetic

LD   (:= op = 11000) → (A ← M[z]);                              load accumulator
LDD  (:= op = 11001) → (A□Q ← M[z]□M[zd]);                      double load
STO  (:= op = 11010) → (M[z] ← A);                              store accumulator
STD  (:= op = 11011) → (M[z]□M[zd] ← A□Q);                      double store
A    (:= op = 10000) → (Ov,C□A ← A + M[z]);                     add
AD   (:= op = 10001) → (Ov,C□A□Q ← A□Q + M[z]□M[zd]);           double add
S    (:= op = 10010) → (Ov,C□A ← A - M[z]);                     subtract
SD   (:= op = 10011) → (Ov,C□A□Q ← A□Q - M[z]□M[zd]);           double subtract
M    (:= op = 10100) → (A□Q ← A × M[z]);                        multiply
D    (:= op = 10101) → (Ov,Q ← A□Q / M[z];                      divide
                        A ← A□Q mod M[z]);

Logical instructions

AND  (:= op = 11100) → (A ← A ∧ M[z]);                          logical and
OR   (:= op = 11101) → (A ← A ∨ M[z]);                          logical or
EOR  (:= op = 11110) → (A ← A ⊕ M[z]);                          logical exclusive or

Compare

CMP  (:= op = 10110) → ((A < M[z]) → (I ← I + 1);               compare
                        (A = M[z]) → (I ← I + 2));
DCM  (:= op = 10111) → ((A□Q < M[z]□M[zd]) → (I ← I + 1);       double compare
                        (A□Q = M[z]□M[zd]) → (I ← I + 2));

Shifts

SLA  (:= shop = 00010□0□00) → (                                 shift left logical
         A ← A × 2^s [logical]; C ← A<s-1>);
SLT  (:= shop = 00010□0□10) → (                                 shift double left logical
         A□Q ← A□Q × 2^s [logical]; C ← A<s-1>);
SRA  (:= shop = 00011□0□00) → (A ← A / 2^s [logical]);          shift right logical
SRT  (:= shop = 00011□0□10) → (A□Q ← A□Q / 2^s);                shift right A and Q
RTE  (:= shop = 00011□0□11) → (A□Q ← A□Q / 2^s [rotate]);       rotate right A and Q
SLCA (:= shop = 00010□0□01) → (                                 shift left and count A
         (t = 0) → (A ← A × 2^s; C ← A<s-1>);
         (t ≠ 0) → (A ← normalize(A);
                    C□XR[t]<10:15> ← normalize_exponent(A);
                    XR[t]<8,9> ← 0));
SLC  (:= shop = 00010□0□11) → (¬((s = 0) ∨ A<0>) → (            shift left and count
         (t = 0) → (A□Q ← A□Q × 2^s; C ← A<s-1>);
         (t ≠ 0) → (A□Q ← normalize(A□Q);
                    C□XR[t] ← normalize_exponent(A□Q))));

LDX  (:= op = 01100) → ((t = 0) → (I ← z');                     load index or instruction counter
                        (t ≠ 0) → (XR[t] ← z'));
STX  (:= op = 01101) → ((t = 0) → (M[z'] ← I);                  store index or instruction counter
                        (t ≠ 0) → (M[z'] ← XR[t]));
STS  (:= op = 00101) → (                                        store status
         (f ∧ bo) → M[z]<P> ← cond<15>;
         ¬bo → (M[z]<8:15> ← 000000□C□Ov; C□Ov ← 00));
LDS  (:= i[0] = 00100□0□00□000000□**) → (C ← i[0]<14>;          load status
                                         Ov ← i[0]<15>);
BSC  (:= (op = 01001) ∧ ¬i<9>) → (                              branch or skip on condition
         (skip_condition ∧ ¬f) → (I ← I + 1);
         (¬skip_condition ∧ f) → (I ← z);
         d<15> → Ov ← 0);

skip_condition := (
         (¬Ov ∧ d<15>) ∨                                        overflow off
         (¬C ∧ d<14>) ∨                                         carry off
         (¬A<15> ∧ d<13>) ∨                                     Accumulator even
         ((A > 0) ∧ d<12>) ∨                                    Accumulator greater than zero
         (A<0> ∧ d<11>) ∨                                       Accumulator negative
         ((A = 0) ∧ d<10>))                                     Accumulator zero

BOSC (:= (op = 01001) ∧ i<9>) → (                               branch out of interrupts
         (skip_condition ∧ ¬f) → (I ← I + 1; Interrupt ← 1);
         (¬skip_condition ∧ f) → (I ← z; Interrupt ← 1);
         d<15> → (Ov ← 0));
BSI  (:= op = 01000) → (                                        branch and store instruction register
         ¬f → (I ← z + 1; M[z] ← I);
         f → ((d<15> → Ov ← 0);
              ¬skip_condition → (I ← z + 1; M[z] ← I)));
MDX  (:= op = 01110) → (                                        modify index and skip
         (t = 0) ∧ ¬f → (I ← I + dsgn);                         local branch
         (t = 0) ∧ f → (M[a] ← M[a] + dsgn;
              (Msum = 0) ∨ (M[a]<0> ⊕ Msum<0>) → (I ← I + 1));  result zero or sign change
         (t ≠ 0) → (XR[t] ← XR[t] + xi;
              (Xsum = 0) ∨ (XR[t]<0> ⊕ Xsum<0>) → (I ← I + 1))); result zero or sign change

         Msum<0:15> := (M[a] + dsgn)
         Xsum<0:15> := (XR[t] + dsgn)

Wait (:= i = 3000₁₆) → (I ← I - 1);
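The skip_condition test used by BSC and BOSC can be checked with a small Python sketch. It assumes the ISP convention that bit 0 is the most significant bit, treats A as a 16-bit two's-complement word, and follows the comment "Accumulator even" for the d<13> case; the function and argument names are illustrative only.

```python
def skip_condition(A, Ov, C, d):
    """Evaluate the BSC/BOSC test; A is a 16-bit word, d holds i[0]<8:15>."""
    def bit(n):
        # ISP numbers bits left to right, so bit 15 is the least significant.
        return (d >> (15 - n)) & 1

    negative = ((A >> 15) & 1) == 1
    zero = (A & 0xFFFF) == 0
    return bool(
        (not Ov and bit(15))                         # overflow off
        or (not C and bit(14))                       # carry off
        or ((A & 1) == 0 and bit(13))                # accumulator even
        or (not negative and not zero and bit(12))   # accumulator greater than zero
        or (negative and bit(11))                    # accumulator negative
        or (zero and bit(10))                        # accumulator zero
    )
```

Several condition bits may be set in one instruction; the result is the OR of every selected test.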



IO Control Instruction

XIO  (:= op = 00001) → (                                        Execute I/O, not defined
         IOCC[0:1] ← M[z]□M[zd])

)                                                               end Instruction_execution

IO Instruction Format:

IO Address<0:15>         := IOCC[0]               address of IO data
IO Device or Area<0:4>   := IOCC[1]<0:4>          io device name
IO Function<5:7>         := IOCC[1]<5:7>
IO Modifier<8:15>        := IOCC[1]<8:15>         device function details

Device mode off line               := (IO Function = 0)
Device mode write                  := (IO Function = 1)
Device mode read                   := (IO Function = 2)
Device mode sense interrupt level  := (IO Function = 3)
Device mode control                := (IO Function = 4)
Device mode initialize write       := (IO Function = 5)
Device mode initialize read        := (IO Function = 6)
Device mode sense                  := (IO Function = 7)
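A hedged sketch of how the two-word IO control command might be unpacked is given below. The Function<5:7> and Modifier<8:15> fields are taken from the format above; the device field is assumed to occupy the remaining bits <0:4>, and the function name `decode_iocc` and the returned dictionary are hypothetical.

```python
IO_FUNCTIONS = {
    0: "off line",
    1: "write",
    2: "read",
    3: "sense interrupt level",
    4: "control",
    5: "initialize write",
    6: "initialize read",
    7: "sense",
}

def decode_iocc(iocc0, iocc1):
    """Split IOCC[0] and IOCC[1] into the fields of the IO instruction format."""
    function = (iocc1 >> 8) & 0x7          # bits 5:7, counting bit 0 as the MSB
    return {
        "address": iocc0 & 0xFFFF,         # IO Address<0:15>
        "device": (iocc1 >> 11) & 0x1F,    # assumed bits 0:4
        "function": IO_FUNCTIONS[function],
        "modifier": iocc1 & 0xFF,          # bits 8:15
    }
```

For example, `decode_iocc(0x0200, 0x1E01)` selects device 3 in the "initialize read" mode with modifier 1.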

Chapter  34 

The engineering design of the Stretch computer¹

Erich Bloch

Summary The Stretch computer is an advanced scientific computer with variable facilities for floating-point, fixed-point, and variable-field-length arithmetic and data-handling facilities.

The performance goal of 100 × 704 speed is achieved by high-speed circuits, multiplexing, and simultaneous-operation technique of instruction and data-fetching, as well as overlap within the execution units. This massive overlap and multiplexing results in complicated recovery routines between the look-ahead and instruction units. These units are described in detail, as are the arithmetic units and significant algorithms used in the floating-point arithmetic.

A flexible set of circuits using a current-switching technique with overriding-level facility is described, as well as the packaging of circuits on printed cards. The frame and gate concept is also shown. Performance figures and hardware count illustrate the size, complexity, and performance of the system.

Introduction 

The Stretch computer [Dunwell, 1956] project was started in order to achieve two orders of magnitude of improvement in performance over the then existing 704. Although this computer, like the 704, is aimed at scientific problems such as reactor design, hydrodynamics problems, partial differential equations, etc., its instruction set and organization are such that it can handle with ease data-processing problems normally associated with commercial applications, such as processing of alphanumeric fields, sorting, and decimal arithmetic.

In order to achieve the stated goal of performance, all factors that go into the computer design must contribute towards the performance goal; this includes the instruction set [Buchholz, 1958], the internal system organization, the data and instruction word length, and auxiliary features such as status-monitoring devices, the circuits, packaging, and component technology. No one of them by itself can give this hundred-fold increase in speed; only by the combining and interacting of these contributing factors can this performance be obtained.

¹Proc. EJCC, pp. 48-59, 1959.


This paper reviews the engineering design of the Stretch System with primary concentration on the central computer as the main contributor to performance. In it, these new techniques, devices, and instructions have been pushed to the limit set by the present technology and, therefore, its analysis will convey best the problems encountered and the solutions employed.


The  Stretch  system 

Early in the system design, it appeared evident that a six-fold improvement in memory performance and a ten-fold improvement in basic circuit speed over the 704 was the best one could achieve. To meet the proposed performance criteria, the system had to be organized in such a way that it took advantage of every possible overlap of systems function, multiplexing of the major portion of the system, processing of operations simultaneously, and anticipation of occurrences, wherever possible. The system had to be capable of making assumptions based on the probability that certain events might occur, and means had to be provided to retrace the steps when the assumption proved to be wrong.

This simultaneity and multiplexing of operations reflects itself in the Stretch System at all levels, from overall systems organization to the cycle of specific instructions. In the following description, this will be discussed in more detail.

If one considers the Stretch System (Fig. 1) from an overall point of view it becomes apparent that the major parts of the system can operate simultaneously:

a The 2-μsec, 16,384-word core memories are self-contained, with their own clocks, addressing circuits, data registers and checking circuits. The memories themselves are interleaved so that the first two memories have their addresses distributed modulo 2 and the other four are interleaved modulo 4. The modulo-2-interleaved memories are used primarily for instruction storage; since, for high-performance instructions, halfword formats are used, the average rate of obtaining instructions is one per ½ μsec. Similarly, a 0.5-μsec data-word rate is achieved by the use of four modulo-4 organized memories. The addressing of the memories and the transfer of information from and to the memories by a memory bus permits new addresses, information, or both to pass through the bus every 200 mμsec.

b The simultaneously-operating Input/Output units are linked with the memories and the computer through the Exchange, which, after initial instruction by the computer, coordinates the starting of the I/O equipment, the checking and error-correction of the information, the arrangement of the information into memory words, and the fetching and storing of the information from and to memory. All these functions are executed without the use of the computer, so it can in the meantime continue its data processing and computation.

c The central computer processes and executes the stored program. Here, now, the simultaneity and multiplexing of functions has reached its ultimate.

Fig. 1. The Stretch system. [Block diagram: six 2-μsec, 16K core memories (two instruction memories, mod 2 interleaved; four operand memories, mod 4 interleaved) on the memory in and out buses; memory bus; central computer; disk synchronizer unit with disk unit; I/O exchange with console, reader, printer, punch, and tape adapters.]
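The interleaving arithmetic in item a can be made concrete with a short sketch. It assumes, purely for illustration, that the two instruction memories occupy the lowest addresses and the four operand memories follow; the paper does not state the address assignment, and the function names are hypothetical.

```python
WORDS_PER_BANK = 16_384
CYCLE_US = 2.0  # core memory cycle time in microseconds

def bank_of(addr):
    """Map a word address to one of six banks (0-1 instruction, 2-5 operand)."""
    instruction_words = 2 * WORDS_PER_BANK
    if addr < instruction_words:
        return addr % 2                           # modulo-2 interleave
    return 2 + (addr - instruction_words) % 4     # modulo-4 interleave

def steady_state_interval_us(banks, items_per_word=1):
    """Average time between items when consecutive words hit different banks."""
    return CYCLE_US / (banks * items_per_word)
```

With two banks and two half-word instructions per word the interval is 2.0 / (2 × 2) = 0.5 μsec per instruction, and with four banks it is 2.0 / 4 = 0.5 μsec per data word, matching the rates quoted in the text.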


Before discussing the computer organization, a few general features must be mentioned for completeness:

a  Word  length:  64  bits  plus  eight  bits  for  parity  checks  and 
error-correction  codes. 

b  Memory  capacity  and  addressing:  A  possible  256,000  words 
can  be  randomly  addressed.  These  storage  positions  are  all 
in  external  memory,  except  for  the  32  first  addresses.  These 
positions  consist  of  the  internal  registers  (accumulators,  time 
clocks,  index  registers). 

c  The  instructions  are  single-address  instructions  with  the 
exception  of  a  number  of  special  codes  that  imply  the 
second  address  explicitly. 

The instruction set (Fig. 2) is generalized and contains a full set for single- and double-precision floating-point arithmetic, and a full set for variable-field-length integer arithmetic (binary and decimal). It also has a generalized set for index modification and a branching set, as well as a set of I/O instructions. All told, 735 different types of instructions are used in the system.

d The instruction format (Fig. 3) makes use of both half and full words; half words accommodate indexing and floating-point instructions (for optimum performance these two sets of instructions use a rigid format), and full-word formats are used by the variable-field-length instructions. Notice that the latter specifies the operand field by the address of its left-most bit, the length of the field, and the byte¹ size, as well as the starting point (offset) of the implied operand (accumulator). Both halves of the word are independently indexable.

e A general monitoring device used for important status triggers is called the Interrupt [Brooks, 1957] System. This system monitors the flip-flops which reflect internal malfunctions, result significance (exponent range, mantissa zero, overflow, underflow), program errors (illegal instruction, protected memory area), and input/output conditions (unit not ready, etc.). The status of these flip-flops can cause a break in the normal progression of the stored program for fix-up purposes. Their status is automatically interrogated at all times.

¹Byte: a generic term to denote the number of bits to be operated on as a unit by a variable-field-length instruction.


COMPUTER VOCABULARY

Variable field length arithmetic. Class: binary, decimal. Modifier: signed, unsigned, same sign, negative sign. Examples: add (to memory), load/store, mpy, divide, cumulative mpy. Number of instr.: 280.

Radix conversion. Examples: bin/dec. Number of instr.: 32.

Logic connects. Examples: 16 logic statements. Number of instr.: 48.

Floating point arithmetic. Class: normalized, unnormalized. Modifier: same sign, opposite sign, negative sign, noisy mode. Examples: add (single & double), load/store, mpy (single & double), div (with remainder), interchange divide, cumulative mpy, square root. Number of instr.: 240.

Indexing arithmetic. Class: direct, immediate, progressive. Number of instr.: 43.

Branches. Class: unconditional, indexing, indicator or bit. Modifier: if 1, if 0. Examples: store inst ctr, set 0, leave bit, invert bit. Number of instr.: 68.

Transmit/swap, I/O instruction. Number of instr.: 24.

Total: 735.

Fig. 2. The instruction set.


Fig. 3. Data word and instruction word formats. [Data formats: integer (bytes 1 to 8), floating point (exponent, mantissa (fraction), flag), index word (value, count, refill), and I/O control word (data word address, I/O status, count, refill), each followed by ECC and parity bits. Instruction formats: the full-word integer format (word address, bit address, index, field length, byte size, accumulator offset, operation) and the half-word floating-point and direct-index formats (address, index, operation).]


The  Stretch  computer 

If one considers the internal organization of the majority of computers that have been produced during the last eight years (and the 704 is a case in point), the organization looks as shown in Fig. 4a. There is a sequential flow of instructions into the computer, and after due processing and execution, the next instruction is called from memory. Compare this with Fig. 4b, showing the organization of Stretch, where two instruction words and four operands can be fetched simultaneously. In addition, the execution of the instruction is done in parallel and simultaneously with the described fetching functions.

All the units of the computer are loosely coupled together, each one controlled by its own clock system, which in turn is synchronized by a master oscillator. This multiplexing of the units of the computer results in a large number of registers and adders, since time-sharing of the major computer organs is no longer possible. All in all, the computer has 3,000 register positions and about 450 adder positions.

Despite the multiplexing and simultaneous operation of successive instructions, the result appears as if sequential step-by-step internal operation were utilized. This has made the design of the interlocks quite complex.


Data  flow 

The data flow through the computer is shown in Fig. 5 and is comparable to a pipeline which in a steady state (namely, once filled) has a large output rate no matter what its length. The same is true here; after start-up the execution of the instructions is fast and bears no relation at all to the stages it must progress through.
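The pipeline analogy can be illustrated numerically. The sketch below is an idealization, not a model of Stretch: it assumes every stage takes the same time and that no interlock or memory delay ever stalls the flow, and the function names are hypothetical.

```python
def pipeline_time(n_instructions, n_stages, stage_time_us):
    """Total time for n instructions through an ideal pipeline with no stalls."""
    # The first instruction needs n_stages steps; each later one adds one step.
    return (n_stages + n_instructions - 1) * stage_time_us

def average_time(n_instructions, n_stages, stage_time_us):
    """Average time per instruction over the whole run."""
    return pipeline_time(n_instructions, n_stages, stage_time_us) / n_instructions
```

For a long run the average approaches one stage time whether the pipeline has four stages or eight, which is the sense in which the output rate is independent of the length.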


Fig. 4. Comparison of Stretch and 704 organization. [(a) 704: instruction fetch, instruction updating, data word fetch, and instruction execution in sequence. (b) Stretch: instruction fetch and updating of four instructions, fetch of four data words, and instruction execution proceeding concurrently.]




Fig. 5. Stretch computer: units and data flow. [Block diagram: memory bus (address and data, memory in and out buses, links to and from the exchange); two instruction word buffers; instruction and indexing unit; error corrector and checker with checker in and out buses; four operand buffers forming the look-ahead; look-ahead transfer bus; interrupt system; two-word accumulator (A, B); two-word operand register (C, D); arithmetic checker; serial arithmetic unit; parallel arithmetic unit.]


The Memory Bus is the communication link between the memories on one side and the exchanges and the computer on the other. It monitors the requests for storage to, or fetches from, memory, and sets up a priority scheme. Since I/O units cannot hold up their requests, the exchange will get highest priority, followed by the computer. In the computer the instruction-fetch mechanism has priority over the operand-fetch mechanism. All told, the memory bus gets requests from and assigns priority to eight different channels.
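The priority ordering just described can be sketched as a fixed-priority selection. Only the three request classes named in the text are modeled; the paper does not enumerate the eight channels, so the names below are illustrative assumptions.

```python
# Highest priority first, following the order given in the text.
PRIORITY = ["exchange", "instruction_fetch", "operand_fetch"]

def grant(pending):
    """Return the pending request class the memory bus would serve next."""
    for unit in PRIORITY:
        if unit in pending:
            return unit
    return None  # nothing is waiting
```

A real arbiter would also have to hold off a granted request while the addressed memory is busy, as the next paragraph explains.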

Since memory can be accessed from multiple sources, and once accessed it is on its own to complete its cycle, a busy condition can exist. Here again, the memory bus tests for busy conditions and delays the requesting unit until memory is ready to be interrogated on data fetches. The return address is remembered and the requesting unit receives the information when it becomes available. To accomplish this, from the time information is requested the receiving data register is in a reserved status.

Requests for stores and fetches can be processed at a 200 mμsec rate and the time, if no busy or priority conditions exist, to return the word to the requesting unit is 1.6 μsec, a direct function of the memory read-out time.

The  Instruction  Unit  [Blaauw,  1959]  is  a  computer  of  its  own. 
It  has  its  own  instruction  set,  its  own  small  memory  for  index  word 
storage,  and  its  own  arithmetic  unit.  During  its  operation  as  many 
as  six  instructions  can  be  at  various  stages  of  execution. 

The Instruction Unit fetches the instruction words from memory, it steps the instruction counter, and performs the indexing of instructions and the initiation of data fetches. After a preliminary decoding of the class of instruction, it recognizes its own instructions and executes indexing instructions. On branches, conditional or unconditional, the instruction unit executes these. In the case of conditional branches, it makes the assumption that the branch will not be successful.

This  assumption  and  the  availability  of  two  full-word  buffer 
registers  keep  the  flow  of  instruction  to  the  computer  continuous. 
Therefore,  the  rate  of  instructions  entering  the  instruction  unit 
is  for  all  practical  purposes  independent  of  the  memory  cycle. 

Since, for high speed instructions, half-word formats are used, four of these at any one time can be in buffer storage. As soon as the instruction unit starts processing an instruction, it is removed from the buffer, thus making room for the next memory-word access (Fig. 6). Incidentally, half-word instructions and full-word instructions can be intermixed within the same word, and therefore the latter can cross a word boundary. This permits maximum packing of instructions in memory and also serves as a facility for automatic program assemblers and compilers.

The adder path, index registers, and transfer bus to look-ahead complete the instruction unit system (Fig. 6). It should be noted that the index registers are part of the instruction-unit data path, therefore permitting fast access (no long transmission lines) to an index word. There are 16 index words available to the programmer. The index registers, consisting of multi-aperture cores, are operated in a non-destructive fashion, since in a representative program, the index word is used nine out of ten times without modifying it. This permits fast operation under these conditions, and additional time is only applied where modification is involved.

Fig. 6. Instruction unit. [Block diagram: memory out bus; instruction counter; 1Y and 2Y instruction registers; index word storage; X index register; W working register; MOD/EXEC register; index arithmetic unit (IAU) with adder buses A and B and adder out bus; checker in and out buses; memory address bus; look-ahead load.]

After processing through the instruction unit, the updated (indexed) instruction enters a level of the Look-ahead (Fig. 5). Besides the instruction, all necessary information, its associated instruction counter value, and certain tag information are also stored in the same level. The operand, already requested by the instruction unit, will enter this level directly and will be checked and error-corrected while awaiting transfer to the arithmetic units for execution.

An interlocked counter mechanism in the look-ahead keeps its four levels in step, preventing out-of-sequence execution of instructions, even if all information for a succeeding one is available, before the previous instruction has been started.

The pre-accessing of operands by the look-ahead and of instructions by the instruction unit leads sometimes to embarrassing positions, for which a fix-up routine must be provided. Consider the program

(n)        STORE Accumulator m
(n + 1)    LOAD R
(n + 2)    ADD m

and assume instruction (n) is in look-ahead, waiting for execution. If (n + 2) now enters the look-ahead, a reference to m cannot be made, since the data stored in that position is subject to change by the STORE instruction. The look-ahead must recognize this and "forward" the result of instruction (n), when received, to the level where (n + 2) is stored.
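The forwarding rule in this example can be sketched as follows. This is a simplified illustration, not the Stretch logic: the class `LookAhead`, its dictionary-based levels, and the method names are hypothetical, the four-level limit is ignored, and memory is modeled as a plain mapping.

```python
class LookAhead:
    """Toy look-ahead that forwards a pending STORE result to later readers."""

    def __init__(self):
        self.levels = []  # oldest instruction first

    def enter(self, op, addr, memory):
        level = {"op": op, "addr": addr, "operand": None, "waits_on": None}
        if op != "STORE":
            # Look for an earlier STORE to the same address that has no result yet.
            pending = [l for l in self.levels
                       if l["op"] == "STORE" and l["addr"] == addr
                       and l["operand"] is None]
            if pending:
                level["waits_on"] = pending[-1]   # do not read stale memory
            else:
                level["operand"] = memory[addr]   # safe to pre-access
        self.levels.append(level)
        return level

    def store_result(self, store_level, value):
        """Record the STORE's value and forward it to every waiting level."""
        store_level["operand"] = value
        for l in self.levels:
            if l["waits_on"] is store_level:
                l["operand"] = value
                l["waits_on"] = None
```

Running the three-instruction program through this sketch leaves the ADD without an operand until the STORE's value arrives, at which point the ADD receives that value rather than the old content of m.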

Another example is the case where the instruction unit assumed that a conditional branch would not be executed. This instruction is stored in look-ahead and, when it is recognized that the branch was successful, all modifications of addressable registers made by the instruction unit in the meantime must be restored. Look-ahead in this case acts as a recovery memory for this information. A similar condition exists when interrupts occur due to arithmetic results. The look-ahead here again has the data stored pertaining to registers which were modified erroneously in the meantime. The restoring and recovery routines described break into the instruction unit processing, interrupting temporarily the flow of instructions and their indexing.
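The idea of a recovery memory can be illustrated with an undo log. The sketch below is only an analogy under stated assumptions: the class `RecoveryLog` and its methods are hypothetical, registers are modeled as a dictionary, and nothing here reflects how the look-ahead levels actually hold the saved data.

```python
class RecoveryLog:
    """Remember old register values so speculative changes can be undone."""

    def __init__(self):
        self._saved = []

    def write(self, registers, name, value):
        # Save the old value before the speculative modification.
        self._saved.append((name, registers[name]))
        registers[name] = value

    def commit(self):
        """The branch assumption held; the saved values are no longer needed."""
        self._saved.clear()

    def restore(self, registers):
        """The assumption was wrong; undo the changes, newest first."""
        while self._saved:
            name, old = self._saved.pop()
            registers[name] = old
```

Undoing in reverse order matters when the same register is modified more than once after the branch.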

The arithmetic units described later are slaves to the look-ahead, receiving not only operands and instruction codes but also the start-execution signal. Conversely, the arithmetic units signal to the look-ahead the termination of an operation and, in the case of "To Memory" operations, place into the look-ahead the result word for transfer to the proper memory position.

Arithmetic  units 

The  design  of  the  arithmetic  units  was  established  along  lines 
similar  to  the  design  of  look-ahead  and  the  instruction  unit.  Every 
attempt  was  made  to  speed  up  the  execution  of  arithmetic  opera- 
tions by  multiplexing  techniques  and  overlapping  of  the  algo- 
rithm, where  mathematically  permissible. 

The  arithmetic  units,  consisting  of  the  Serial  Unit  and  the 
Parallel  Unit,  use  the  same  arithmetic  registers,  namely  a  double- 
length  accumulator  (A,B)  consisting  of  128  bits  and  a  double-length 
operand  register  (C,D)  consisting  of  128  bits.  The  reason  for  the 
use  of  the  same  arithmetic  registers  is  the  fact  that  at  any  time, 
a  shift  from  floating-point  to  variable-field-length  operation  (or  vice 
versa)  can  be  made  by  the  program.  Therefore,  the  result  obtained 
by  a  floating-point  operation  can  serve  as  the  starting  operand  for 
a  variable-field-length  operation.  The  chief  reason  for  the  double- 
length  registers  is  the  definition  of  maximum  field  length  to  be 
64  bits.  The  field  can  start  with  any  bit  position,  and  therefore 
can  cross  the  word  boundary. 

The  executions  of  floating-point  mantissa  operations  and  varia- 
ble-field-length binary  multiply  and  divide  operations  are  per- 
formed by  the  parallel  unit,  whereas  the  floating-point  exponent 
operation  and  the  variable-field-length  binary  and  decimal  add- 
type  operations  are  executed  by  the  serial  unit.  The  square-root 
operation  and  the  binary-to-decimal  conversion  algorithm  are 
executed  in  unison  by  both  units.  Salient  features  of  the  two  units 
will  now  be  described. 

The serial arithmetic unit [Brooks et al., 1959] (Fig. 7). The serial
arithmetic unit consists of a switch matrix which can extract 16
consecutive bits from A,B and C,D. These 16 bits then can be aligned
in  such  a  way  that  the  low-order  bit  of  a  field  as  specified  by  the  in- 
struction is  at  the  right  end  of  the  field.  This  wrap-around  circuit 
then  feeds  into  a  carry-propagate  adder  or,  in  case  of  logical-con- 
nect  instructions,  into  the  logic  unit.  At  the  adder  output,  a  true 
complement  unit  and  a  binary-to-decimal  correction  unit  are  used 
for  subtract  and  decimal  operations.  The  inverse  process  of  ex- 
tracting is  used  to  insert  the  processed  byte  back  into  the  register 
without  disturbing  any  neighboring  positions.  Notice  that  in  one 
clock  cycle,  the  information  is  extracted,  the  arithmetic  is  per- 
formed and  the  result  inserted  back  into  the  registers.  In  addition, 
the  arithmetic  information  is  checked  by  parity  checks  on  the 
switch  matrices  and  by  duplication  and  comparison  of  the  arith- 
metic procedure  in  a  duplicate  unit. 


Chapter  34  ;  The  engineering  design  of  the  Stretch  computer  429 


Fig. 7. Serial arithmetic unit. [Block diagram: input from look-ahead;
accumulators A, B and operand registers C, D (bits 63-0 each); switch
matrices (16 of 128); wrap-around circuits (8 of 16); true/complement
units (8 bits); binary adder and logic unit; true/complement and
decimal-correct units; A/B and C/D write-in matrices; 8-bit
pass-around paths.]


Parallel arithmetic unit. The parallel arithmetic unit (Fig. 8) is
designed to execute floating-point operations with a maximum of
efficiency. Since both single- and double-precision arithmetic is
performed, the shifter and adder exist in a double-length format
of 96 bits. This insures almost the same performance for single-
and double-precision arithmetic. The adder is of a carry-propagation
type with look-ahead over 4 bits at a time to reduce the delay
that normally results in a ripple-carry adder. This carry look-ahead
results in a delay time of 150 mμsec for 96-bit binary-number
additions. All additions and subtractions are made in one's
complement form with automatic end-around carry.
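One's complement addition with end-around carry can be sketched in a few lines. The function names and the use of Python integers are assumptions made for illustration; the 96-bit width is taken from the text, and the example uses 8 bits only to keep the numbers readable.

```python
def ones_complement_negate(value, width=96):
    """Negate by inverting every bit (one's complement)."""
    return ~value & ((1 << width) - 1)


def ones_complement_add(a, b, width=96):
    """Add two one's complement words; a carry out of the top bit
    is added back in at the low end (end-around carry)."""
    mask = (1 << width) - 1
    total = a + b
    if total > mask:
        total = (total & mask) + 1
    return total & mask


# 5 - 3 on an 8-bit word: the end-around carry corrects the sum to 2.
print(ones_complement_add(5, ones_complement_negate(3, 8), 8))
# 3 - 5 gives the one's complement representation of -2.
print(ones_complement_add(3, ones_complement_negate(5, 8), 8) ==
      ones_complement_negate(2, 8))
```

Subtraction is then simply addition of the bit-inverted operand, which is why a single true/complement stage ahead of the adder is enough.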

The shifter is capable of shifting up to 4 positions to the right
and up to 6 positions to the left. This shifter arrangement takes
care of the majority of shifting operations encountered under
normal operation. Where higher-order shifts are required, a
successive operation is set up between the parallel unit register and
the shifter.

Fig. 8. Floating-point arithmetic unit. [Block diagram: registers A,B
and C,D; parallel unit register; shifter; four MPCD blocks of 3 bits
each; true/complement; carry-propagate adder (100 bits); carry-save
adders 1 to 4; sum register and carry register.]

To expedite the execution of the multiply instruction, 12 bits
of the multiplier are handled within one cycle. This is
accomplished by breaking the 12 bits into groups of three bits each. The
action  is  from  right  to  left  and  consists  of  decoding  each  group 
of  three  bits.  By  observing  the  lowest-order  bit  of  the  next  higher 
group,  a  decision  is  made  as  to  what  multiple  of  the  multiplicand 
one  must  add  to  the  partial  product.  Since  only  even  multiples 
of  the  multiplicand  are  available,  subtraction  and  addition  of  the 
multiples  can  result.  The  following  example  will  elaborate  this 
point:  (MCD  means  multiplicand) 


Groups                          n + 3       n + 2       n + 1         n
Multiplier, 12-bit group         011         110         101         010
Octal value                       3           6           5           2

If two additions of multiples were permitted:
                               4 X MCD     6 X MCD     6 X MCD     2 X MCD
                              -1 X MCD                -1 X MCD

Instead of subtracting 1 X MCD in n + 1, subtract 8 X MCD in n:
                               4 X MCD     6 X MCD     6 X MCD     2 X MCD
                                          -8 X MCD                -8 X MCD

Resulting decoding             4 X MCD    -2 X MCD     6 X MCD    -6 X MCD
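The decoding rule in this example can be checked with a short sketch. The function name and its parameters are hypothetical, and the treatment of the very lowest group of a multiplier is an assumption, since the text does not say how an odd lowest group is handled at the start of a multiplication. Each odd group value is rounded up to the next even multiple, and the unit that was added is repaid by subtracting 8 X MCD in the next lower group.

```python
def recode_groups(multiplier, groups=4, next_group_low_bit=0):
    """Recode 3-bit multiplier groups into signed even multiples of the
    multiplicand. Returns the multiples with the highest group first.

    next_group_low_bit is the lowest-order bit of the group above the
    ones handled in this cycle (0 if there is none).
    """
    digits = [(multiplier >> (3 * i)) & 0b111 for i in range(groups)]
    recoded = []
    for i, digit in enumerate(digits):
        even = digit + (digit & 1)        # odd value rounded up to even
        if i + 1 < groups:
            higher_low_bit = digits[i + 1] & 1
        else:
            higher_low_bit = next_group_low_bit
        recoded.append(even - 8 * higher_low_bit)
    return recoded[::-1]


multiples = recode_groups(0o3652)
print(multiples)

# The signed multiples, weighted by powers of 8, reproduce the multiplier.
value = 0
for m in multiples:
    value = value * 8 + m
print(value == 0o3652)
```

For the octal groups 3, 6, 5, 2 the sketch yields the multiples 4, -2, 6 and -6, in agreement with the "resulting decoding" row.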


The four multiple multiplicand groups and the partial product of
the previous cycle are now fed into carry-save adders of the form,

Sum    S = A ⊕ B ⊕ C   (exclusive OR)
Carry  C = AB + AC + BC

There are four of these adders, two in parallel followed by two
more in series (Fig. 8). The output of Carry-Save Adder 4 then
results in a double-rank partial product, the product sum and the
product carry. For each cycle this is fed into Carry-Save Adder
2, and, during the last cycle, into the carry-propagate adder, for
accumulation of the carries. Since no propagation of carries is
required in the four cycles, where multiple multiplicands are
added, this operation is fast and is the main contributor to the
fast multiply-time of Stretch.
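The sum and carry equations can be tried out directly on integers. The following is a minimal sketch, with function names chosen here for illustration; it does not model the four-adder arrangement of Fig. 8, only the principle that several operands are reduced without carry propagation and a single carry-propagate addition is made at the end. The carry word is shifted one position left because each carry bit has twice the weight of the sum bit in the same column.

```python
def carry_save_add(a, b, c):
    """Reduce three operands to a sum word and a carry word."""
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word, carry_word


def add_many(operands):
    """Accumulate operands in carry-save form, then propagate carries once."""
    sum_word, carry_word = 0, 0
    for operand in operands:
        sum_word, carry_word = carry_save_add(sum_word, carry_word, operand)
    return sum_word + carry_word   # the only carry-propagate addition


print(carry_save_add(5, 6, 7))
print(add_many([3, 9, 27, 81]))
```

The pair returned by `carry_save_add` always adds up to the same total as the three inputs, which is what allows the partial product to be kept as a "double-rank" sum and carry until the last cycle.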

The divide scheme [Robertson, 1958] has a similarity to the
multiply scheme. Multiples of the divisor are used, namely,
3/2 X divisor, 3/4 X divisor and 1 X divisor. This, plus shifting
over strings of ones and zeros, results in the generation of the
required 48 quotient bits within thirteen machine cycles. Most
machines using a nonrestoring divide method require 48 cycles
for 48 quotient bits. The following example explains this technique.
This scheme depends on the use of normalized divisors:

DIVIDEND     (DD) = 101000000000000
DIVISOR      (DR) = 1100011
2's COMP DR  (M)  = 0011101
3/4 DR            = 100101001

(a) Using skip over 1/0 only:

          101000000000000   DIVIDEND
Step 1:   0011101           ADD M

          1101101

Remainder negative, 1st quotient bit = 0; shift one position.
Leading 1 indicates that next quotient bit must be 1; Q1Q2 = 01

          011010000         REMAINDER
Step 2:   1100011           ADD DR

         10010111

Overflow: Remainder positive and Q3 = 1, leading zero indicates
Q4 = 0

          1011100           REMAINDER
Step 3:   0011101           ADD M

          1111001

Negative remainder; Q5 = 0; leading 1's indicate Q6Q7Q8 = 111

Number of quotient bits per cycle:

Cycle 1:  01    = 2
Cycle 2:  10    = 2
Cycle 3:  0111  = 4

(b) The same problem with both skip over 1/0 and 3/4 - 3/2
complement:

          101000000000000
Step 1:   0011101

          11011010000

Same as before. Q1Q2 = 01

Step 2:   100101001         Add 3/4 DR

          111111001

This (by table look-up) indicates Q3Q4Q5Q6Q7Q8 = 100111

Quotient bits generated per cycle:

Cycle 1:  01      = 2
Cycle 2:  100111  = 6
In general, this method results in the generation of 3.7 quotient
bits per subtraction. While the mantissa operations of multiply
and divide are performed by the parallel unit, the serial arithmetic
unit executes the exponent arithmetic. Here again is a case where
overlap and simultaneity of operation is used to special advantage.
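The quotient bits of the worked example can be checked against an ordinary nonrestoring division that produces one bit per add or subtract. This sketch is not the Stretch scheme: it uses neither the skipping over strings of ones and zeros nor the 3/4 and 3/2 multiples, and the function name is an assumption. It serves only as a baseline showing the bits that the faster scheme must also produce.

```python
def nonrestoring_divide(dividend, divisor, quotient_bits):
    """Plain nonrestoring division, one quotient bit per cycle.

    Returns the quotient as a bit string (first bit generated first)
    and the final remainder.
    """
    if divisor <= 0 or not 0 <= dividend < (divisor << quotient_bits):
        raise ValueError("quotient does not fit in the requested bits")
    remainder = dividend
    bits = []
    for position in range(quotient_bits - 1, -1, -1):
        if remainder >= 0:
            remainder -= divisor << position   # subtract the divisor
        else:
            remainder += divisor << position   # add it back, shifted
        bits.append("1" if remainder >= 0 else "0")
    if remainder < 0:
        remainder += divisor                   # final correction
    return "".join(bits), remainder


bits, remainder = nonrestoring_divide(0b101000000000000, 0b1100011, 9)
print(bits, remainder)
```

The first eight bits, 01100111, are the bits Q1 to Q8 of the example. The baseline needs eight cycles to reach them, against three cycles in case (a) and two in case (b).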

Checking. The operation of the computer is checked in its entirety
and correction codes are employed where data transfers from
memory and input-output units are involved. In particular, all
information sent to memory has a correction code associated with
it, which is checked for accuracy on its way from memory. If a
single error is indicated, then correction is made and the error
is recorded via a maintenance output device. Within the machine,
all arithmetic operations are checked, either by parity,
duplication, or a "casting out three" process. These checks are overlapped
with the execution of the next instruction.
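A "casting out three" check compares the residues modulo 3 of the operands with the residue of the result. The text does not describe how Stretch implemented the check, so the functions below are only an illustration of the principle, with names chosen here. Any single-bit error changes a result by a power of two, which is never a multiple of 3, so a residue check of this kind detects it.

```python
def residue3(value):
    """Residue of a value after casting out threes."""
    return value % 3


def add_checks_out(a, b, result):
    """True if result is consistent with a + b modulo 3."""
    return (residue3(a) + residue3(b)) % 3 == residue3(result)


def multiply_checks_out(a, b, result):
    """True if result is consistent with a * b modulo 3."""
    return (residue3(a) * residue3(b)) % 3 == residue3(result)


a, b = 123456789, 987654321
product = a * b
print(multiply_checks_out(a, b, product))            # correct product passes
print(multiply_checks_out(a, b, product ^ (1 << 17)))  # one flipped bit fails
```

The check works on small residues only, so it can proceed alongside the main arithmetic rather than repeating it.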

Hardware count. Figure 9 shows the percentage of transistors used
in the various sections of the machine. It becomes obvious that
the parallel unit and the instruction unit use the highest
percentage of transistors. In case of the parallel unit this is due to the
extensive circuits for multiply and to the additional hardware to
achieve speed up of the divide scheme. In the instruction unit,
the controls consume the majority of the transistors, because of
the high multiplexed operation encountered.

UNIT                    NO. OF TRANSISTORS   % OF TOTAL   NO. OF FRAMES

MEMORY CONTROLS               10,500             6.0          2
INSTRUCTION UNIT                                22.0
  DATA PATH                   17,700                          2
  CONTROLS                    19,500                          3-1/2
LOOK-AHEAD                                      15.6
  DATA PATH                   17,900                          1
  CONTROLS                     8,600                          1-1/2
ARITH REGISTERS               10,000             5.9          1
SERIAL ARITH UNIT                               10.5
  DATA PATH                   10,000                          1-1/2
  CONTROLS                     8,700                          1
FLOATING PT UNIT                                21.0
  DATA PATH                   32,700                          2-1/2
  CONTROLS                     3,000                          1/2
CHECKING                      24,500            14.5          1
INTERRUPT SYSTEM               6,000             3.5          1/2

TOTAL                        169,100           100.0         18

DOUBLE CARDS    4,025
SINGLE CARDS   18,747
POWER          21 KW

Fig. 9. Component count.

Performance. The performance comparisons in Fig. 10 show the
increase in speed achieved, especially in floating-point operations,
over the 704. It should be noted that for a large number of
problems this particular increase in all arithmetic speeds is almost
proportional to the performance increase of the problem as a
whole, since the instruction execution-times are overlapped to a
great extent with the preparation and fetching of instructions.


Simulation  of  Stretch  programs  on  the  704  proved  a  performance 
of  100  X  704  speed  in  mesh-type  calculations.  Higher  performance 
figures  are  achieved  where  double-  or  triple-precision  calculations 
are  required. 




Circuits 

Having  reviewed  the  systems  organization  of  Stretch,  it  is  now 
of  interest  to  discuss  briefly  the  components,  circuits,  and  packag- 
ing techniques  used  to  implement  the  design. 

The basic component used in Stretch is the high-speed drift
transistor which exists in both an NPN and a PNP version. This
transistor has a frequency cut-off of approximately 100 mc and
for high-speed operation must be kept out of saturation at all times.
This then explains why both the PNP and NPN version are used:
mainly to avoid the problem of level translation, which would be
required due to the potential difference of the base and the
collector. This difference is 6 volts, an optimum point for this device.

OPERATION                      IBM 704        IBM 705              STRETCH

1. FLOATING POINT
   EXPONENT RANGE              ±128                                ±2048
   MANTISSA BITS               27                                  48
   FLOATING ADD                84 μSEC                             1.0 μSEC
   FLOATING MPY                204 μSEC                            1.8 μSEC
   FLOATING DIV                216 μSEC                            7.0 μSEC
   LOAD/STORE                  24 μSEC                             0.6 μSEC

2. BINARY VARIABLE FIELD LENGTH ARITH (16 BIT FIELD)
   BIT RANGE                                                       1 TO 64
   ADD/LOAD/STORE                                                  2.0 μSEC
   MPY                                                             10.0 μSEC
   DIVIDE                                                          15.0 μSEC

3. DECIMAL ARITHMETIC (FOR [?] DIGITS)
   DIGIT RANGE                                1 - MEM CAPACITY     1 TO 21
   ADD                                        119 μSEC             3.5 μSEC
   MPY                                        799 μSEC             40.0 μSEC
   DIVIDE                                     4828 μSEC            65.0 μSEC
   LOAD/STORE                                 204 μSEC             3.2 μSEC

4. MISCELLANEOUS
   ERROR CORRECTION            NO             NO                   YES
   CHECKING                    NO             YES                  YES
   WORD SIZE                   36 BITS                             64 BITS

Fig. 10. Comparison of Stretch and 705/704 operation times.

Fig. 11. Current switching circuits (+AND). [Panels: symbol; truth
table; circuit diagram (6 ma current source, R = 4.5K to +30); input
and output signal voltages; circuit response, delay 20 mμsec.]

Fig. 12. Third-level circuit. [Panels: symbol; truth tables; circuit
diagram (R = 4.5K to +30).]

Fig. 13. Emitter-follower circuit. [Panels: circuit; truth tables for
AND and OR; circuit diagram; minimum and maximum signal voltages at
the beginning and end of a chain; circuit response, delay 10 mμsec.]

Figure 11 shows the basic circuit configuration. It consists of
a current source, represented by the +30 volt supply and resistor
R. The functional operation of the circuits consists of two possible
paths represented by transistor A or C. Which path is chosen by
the current depends on the condition existing on base A. If point
A is positive with respect to ground by 0.4 volts, that particular
transistor is cut off, making the emitter of transistor C positive
with respect to the base and, therefore, making C conducting. The
current supplied by the current source (6 ma) will then flow
through transistor C to the load φ. Output φ, then, is positive by
0.4 volts with respect to the -6 volt reference. This indicates at
φ the equivalent function impressed on A. At the same time, φ̄
is negative with respect to the -6 volt power supply by 0.4 volt,
representing, therefore, the inverse of the function impressed on
A. Conversely, if A is negative with respect to the ground reference,
transistor A is the conducting one, keeping emitter C negative with
respect to its base. The current flows through transistor A, making
φ̄ positive with respect to -6 and φ negative with respect to -6.
Again, the output of φ reflects the function impressed on A,
whereas φ̄ represents the inverse of the function.

If an additional transistor now is paralleled with A, it becomes
obvious that only if both bases A and B are positive will output
φ be positive and φ̄ negative. If only one or neither of the bases
A and B is positive, then φ will be negative and φ̄ will be positive.
In other words, an AND function is obtained on output φ.

This principle, which is reflected in all the circuits, is
essentially the principle of current switching or current steering.

Logical functions for the PNP circuits are, therefore, a +AND
or -OR. Two outputs from each circuit block are available: the
AND function and the inverse of the AND function.

A dual circuit exists for NPN transistors with input levels at
-6 volts and output levels at ground. This circuit will give the
+OR or -AND function.

A  thorough  investigation  of  the  systems  design  showed  that  the 
circuits  described  so  far  are  versatile  enough  to  be  used  throughout 
the  system.  However,  there  are  enough  special  cases  (resulting 
from  the  many  data  buses  and  registers  throughout  the  machine) 
that  could  use  a  distributor  function  or  an  overriding  function. 
This  caused  the  design  of  a  circuit  which  permitted  great  savings 
in  space  and  transistors  by  adding  a  third  voltage  level.  Figure 
12  shows  the  PNP  version  of  the  third-level  circuit. 


Fig.  14.  The  circuit  package. 


438  Part 5 | The PMS level


Section  2  |  Computers  with  one  central  processor  and  multiple  input/output  processors 


If transistor X were eliminated, then transistors A and B in
conjunction with the reference transistor C would work normally
as a current switching circuit, in this case a +AND circuit. If
transistor X is added with the stipulation that the down level of
X is more negative than the lowest possible level of A or B, it
becomes apparent that when X is negative, the current will flow
through that branch of the circuit in preference to branch φ or
φ̄, regardless of inputs A and B. Therefore, the outputs φ and
φ̄ will be negative, provided input X is negative. The output of
the X branch is the inverse of input X. If, however, X is positive,
then the status of A and B will determine the function φ and φ̄
implicitly. This demonstrates the overriding function of input X.

Similarly, the NPN version (not shown) results in the OR
function of φ if input X is negative and in a positive output at
φ and φ̄, regardless of status A and B, if X is positive. Again
minimum and maximum signal swings are shown in Fig. 12.

The speed of the circuits described so far depends on the
number of inputs and the number of circuits driven from each
load. The response of the circuit is anywhere between 12 and 25
mμsec per logical step with 18 to 20 mμsec average. The number
of inputs allowable per circuit is eight. The number of driven
circuits is three. Additional circuits are needed to drive more than
three bases. Where current switching circuits communicate
over long lines, termination networks must be added to avoid
reflections.

To  improve  the  performance  of  the  computer  in  certain  critical 
places, emitter-follower logic is used as shown in Fig. 13. These
circuits,  having  a  gain  less  than  one,  after  a  number  of  stages 
require  the  use  of  current  switching  circuits  as  level  setters  and 
gain  devices.  Both  AND  and  OR  circuits  are  available  for  both 
a  ground-level  and  a  —  6-level  input.  Change  from  a  —  6-level 
circuit  to  a  ground-level  circuit  is  obtained  by  applying  the  ap- 
propriate power  supply  levels.  Due  to  the  variations  in  inputs  and 
driven  loads,  the  circuits  must  be  designed  so  that  the  load  can 
vary  over  a  wide  range.  This  resulted  in  instability  which  had  to 
be  offset  by  the  feedback  capacitor  C  shown  in  the  circuit. 

All functions needed in the computer can be implemented by
the use of the aforementioned circuits, including flip-flop
operation, which is obtained by tying a PNP current switch block and
an NPN current switch block together with proper feedback.

Packaging 

The  circuits  described  in  the  last  paragraph  are  packaged  in  two 
ways: 


A  circuit  package  using  the  smaller  of  the  two  printed  circuit 
boards  shown  in  Fig.  14,  called  a  single  card,  contains  AND  or 
OR  circuits.  It  should  be  mentioned  that  the  printed  wiring  is 
one-sided  and  that  besides  the  components  and  transistors,  a  rail 
is  added  which  permits  the  shorting  or  addition  of  certain  loads 
depending  on  the  use  of  the  circuits.  This  rail  then  has  the  effect 
of  reducing  the  different  types  of  circuit  boards  in  the  machine. 
Twenty-four different boards are used and of these, two types
reflect approximately 70% of the total single card population.

Due to the large number of registers, adders, and shifters used
in the computer, it seems reasonable that functional packages
could be employed economically, because of wide usage. This
results in the high-density package also shown in Fig. 14, called
a Double Card, which has 4 times the capacity of a single card
and which has wiring on both sides of the board. Furthermore,
components are double-stacked; and again, the rail is used to effect
circuit variations due to different applications. Eighteen double
card types are used in the system. Approximately 4,000 double
cards are used, housing 60% of the transistors. The rest of the
transistors are on approximately 18,000 single cards.

Fig. 15. The back panel.

The cards, both single and double, are assembled in gates, and
two gates are assembled into a frame. Figure 15 shows the gate
back-panel wiring, using wire-wraps; and Figs. 16 and 17 the frame
construction, both in a closed and open version.

Fig. 16. The frame (closed).

Fig. 17. The frame (extended).

To achieve high performance, special emphasis must be placed
on keeping noise to a low level. This required the use of a plane
which overlies the whole back panel, against which the intercircuit
wiring is laid. In addition, the power-supply distribution system
must be of such a low impedance that extraneous noise cannot
induce circuit malfunction. For this reason, a bus system,
consisting of laminated copper sheets, is used to distribute the power
to each row of card sockets. The wiring rules are such that
single-conductor wire is used up to a maximum of 24", twisted pair to
a maximum of 36", unterminated coax to a maximum of 60", and
terminated coax to a maximum of 100 feet. The whole back-panel
construction and the application of single wire, twisted pair, or
coax are calculated by a computer program to minimize the noise
on each circuit node.
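The length limits in the wiring rules can be written as a small lookup. This is only a sketch of the length rule: the program mentioned in the text also minimized noise on each circuit node, which is not modeled here, and the names `WIRING_RULES` and `choose_wire` are hypothetical.

```python
# Maximum run length in inches for each kind of wire, from the wiring rules.
WIRING_RULES = [
    ("single conductor", 24),
    ("twisted pair", 36),
    ("unterminated coax", 60),
    ("terminated coax", 100 * 12),   # 100 feet
]


def choose_wire(length_inches):
    """Return the first kind of wire whose limit covers the run length."""
    for kind, max_length in WIRING_RULES:
        if length_inches <= max_length:
            return kind
    raise ValueError("run exceeds the 100-foot limit for terminated coax")


for length in (12, 30, 48, 200):
    print(length, choose_wire(length))
```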

The two gates of a frame are a sliding pair with the power
supply mounted on the sliding portion. All connecting wires
between frames are coax and arrayed in layers which are formed
into a drape.

References 

BlaaG59; BrooF57a, 59; BuchW58; DunwS56; RobeJ58; BlosR60;
BuchW57, 62; BrooF60; CockJ59; CoddE59, 62.


Chapter  35 


PILOT, the NBS multicomputer system¹

A.  L.  Leiner  /  W.  A.  Notz  /  J.  L.  Smith 
A.  Weinberger 

Summary  PILOT,  the  new  NBS  system,  possesses  both  powerful  external 
control  capabilities  and  versatile  internal  processing  capabilities.  It  contains 
three  independently  operating  computers.  The  primary  and  secondary 
computers  each  utilize  only  16  basic  types  of  instructions,  thus  providing 
a  simple  code  structure;  but  because  so  many  variations  of  the  formats 
are  possible,  a  wide  variety  of  computing,  data-processing,  and  informa- 
tion-retrieval operations  can  be  performed  with  these  instructions.  The 
secondary  computer  is  specially  adapted  for  performing  so-called  "red- 
tape"  operations,  and  both  the  secondary  and  the  primary  computers,  acting 
co-operatively,  can  carry  out  special  complex  sorting  or  search  operations. 
The  third  computer  in  the  system,  called  the  format  controller,  is  specially 
adapted  for  performing  editing,  inspecting,  and  format  modifying  opera- 
tions. The  system  is  equipped  to  transfer  information  concurrently  along 
several  input-output  trunks,  though  only  two  are  planned  for  the  near 
future. Using two such trunks, it is possible to maintain two continuous
streams  of  data  simultaneously  flowing  between  any  two  external  units  and 
the  internal  memory,  without  interrupting  the  data-processing  program. 
The  system  can  operate  with  a  wide  variety  of  input-output  devices,  both 
digital  and  analog,  either  proximate  or  remotely  located.  The  external 
control  capabilities  of  the  system  enable  the  machine  to  supervise  this  wide 
family  of  external  devices  and,  on  an  unscheduled  basis,  to  interrupt  or 
redirect  its  overall  program  automatically,  in  order  to  assist  or  manage 
them. 

At  the  National  Bureau  of  Standards  (NBS)  a  new  large-scale 
digital  system  has  been  designed  for  carrying  out  a  wide  range 
of  experimental  investigations  that  are  of  special  importance  to 
the  Government.  The  system  can  be  utilized  for  investigating  new 
or  stringent  applications  of  these  general  types:  (1)  data-processing 
applications,  in  which  the  system  can  be  used  for  performing 
accounting  and  information-retrieval  operations  for  management 
purposes;  (2)  mathematical  applications,  in  which  the  system  can 
be  used  for  performing  mathematical  calculations  for  scientific 
purposes, including scientific data-reduction; (3) control
applications, in which the system can be used for performing real-time
control and simulation operations, in conjunction with analog
computer facilities or in conjunction with other instrument
installations, remotely located if necessary; and (4) network applications,
in which the system can be used in conjunction with other digital
computer facilities, forming an interconnected communication
network in which all the machines can work together
collaboratively on large-scale problems that are beyond the reach of any
single machine.

¹Proc. EJCC, 71-75 (1958).

Because  the  system  was  designed  for  such  varied  uses  (ranging 
from  automatic  search  and  interpretation  of  Patent  Office  records 
to  real-time  scheduling  and  control  of  commercial  aircraft  traffic), 
the  system  is  characterized  by  a  variety  of  features  not  ordinarily 
associated  with  a  single  installation,  namely:  a  high  computation 
rate,  highly  flexible  control  facilities  for  communicating  with  the 
outside  world,  and  a  wide  repertoire  of  internal  processing  formats. 
The  system  contains  three  independently  programmed  computers, 
each  of  which  is  specially  adapted  for  performing  certain  classes 
of  operations  that  frequently  occur  in  large-scale  data-processing 
applications.  These  computers  intercommunicate  in  a  way  that 
permits  all  three  of  them  to  work  together  concurrently  on  a 
common  problem.  The  system  thus  provides  a  working  model  of 
an  integrated  multicomputer  network. 

System  organization 

Exclusive  of  data-storage  and  peripheral  equipment,  the  central 
processing  and  control  units  of  the  over-all  system  contain  ap- 
proximately 7,000  vacuum  tubes  and  165,000  solid-state  diodes. 
The  basic  component  for  these  units  is  a  modified  version  of  the 
one  megacycle  package  used  in  the  NBS  DYSEAC,  which  in  turn 
was  evolved  from  the  hardware  used  in  NBS  Electronic  Automatic 
Computer  (SEAC).  As  a  result  of  a  more  effective  logical  design 
and  faster  memory,  however,  the  new  NBS  system  will  run  more 
than  100  times  faster  than  SEAC  on  programs  involving  only 
fixed-point  operations;  for  programs  involving  floating-point  ma- 
nipulations, the  advantage  exceeds  1,000.  The  arithmetic  speed 
of  the  new  system  derives  in  a  large  part  from  connecting  a  novel 
type  of  parallel  adder  to  a  diode-capacitor  memory  capable  of 
providing  one  random  access  per  microsecond. 

The  system  contains  seven  major  blocks,  which  are  indicated 
in  Fig.  1,  namely:  (1)  the  primary  computer,  in  the  lower  center 


440 


Chapter  35  [  PILOT,  the  NBS  multicomputer  system  441 


Table  1    Arithmetic  operation  times 

(including  4  random  access  times  to  last  memory) 

Total  tinw 
(microseconds) 

Minimum- 
Operation  Average  maximum 

Fixed-point  Addition.  Subtraction,  Comparison       7.5  6-9 

Fixed-point  Multiplication   31  22-40 

Fixed-point  Division    73  72-74 

Floating-point  Addition.  Subtractiont   20  19-21 

Floating-point  Multiplication   37   28-46 

t  For  shift  of  4  bits. 

of  the  figure,  (2)  the  primary  storage,  upper  center;  (3)  the  second- 
ary computer  and  the  secondary  storage,  right;  (4)  the  input-output 
control,  upper  left;  (5)  the  external  storage  units,  upper  far  left; 
(6)  the  external  input-output  units  such  as  readers,  printers,  and 
displays,  lower  far  left;  and  (7)  lower  left,  the  external  control 
containing  the  special  features  that  facilitate  communication  with 
people and devices in the world outside the system, remotely
located if necessary. Interchanges of information between
the  system  and  the  outside  world  can  take  place  at  any  time,  on 


a completely impromptu basis, at the instigation of either the
system  or  the  external  world,  or  both  acting  jointly. 

The  primary  computer,  a  high-speed  general-purpose  com- 
puter, contains  both  an  arithmetic  unit  and  a  program  control  unit 
of  considerable  versatility.  This  computer  can  carry  out  a  variety 
of  high  precision  arithmetic  and  logical  processing  operations,  in 
either  binary  or  decimal  code  and  in  a  wide  variety  of  word  lengths 
and formats. Its partner computer, the secondary computer, specializes
in short-word operations, usually manipulations on address
numbers  or  other  "red-tape"  information,  which  it  supplies  auto- 
matically as  needed  to  the  primary  program.  The  third  computer 
of  the  system,  called  the  format  controller  (see  input-output  con- 
trol in  Fig.  1),  is  specially  designed  for  carrying  out  editing, 
inspecting,  and  format-modifying  operations  on  data  that  are 
flowing  in  or  out  of  the  internal  memory  via  the  peripheral  external 
units of the system. All three computers, and all the external units
of the system, share access privileges to the common high-speed
internal memory, which is linked to the input-output and external
storage units via independent trunks for effecting data-transfers.
Transfers of data can take place between the external units, the
memory units, and the computers concurrently without interrupting
the progress of the computational program. Because of the
flexibility of the format controller, incoming data can be accepted


[Figure 1 (block diagram of the NBS PILOT electronic data-processor): external storage units; input-output control; input-output units (input readers, manual input-output and remote controls); external control; primary storage (high-speed internal memory, 68-bit words, 32,768 total addressable storage words); secondary storage (high-speed internal memory, 16-bit words, 60 storage locations); primary computer (arithmetic and processing unit, program control unit); secondary computer (binary, 16-bit, fixed point). The blocks are linked by concurrent data transfers.]

Fig. 1. Over-all block diagram for PILOT.


Part 5 | The PMS level

Section 2 | Computers with one central processor and multiple input/output processors


from  a  wide  variety  of  external  devices  and  in  a  wide  variety  of 
formats. 

Functions  of  the  major  units 

The  specific  functions  of  the  major  units  can  be  described  briefly 
as  follows: 

Primary  computer 

Arithmetic  and  processing  unit.  Using  a  64-bit  number  word  with 
algebraic  sign,  this  unit  carries  out  7  different  types  of  arithmetical 
operations,  5  types  of  choice  (branch)  operations,  and  2  types  of 
logical  pattern-processing  operations.  See  Table  2.  Arithmetical 
operations can be performed in any of 16 possible formats. For
example,  arithmetic  can  be  performed  using  either  a  pure  binary 
or  a  binary-coded  decimal  number  code,  and  in  both  fixed-point 
and  floating-point  notation.  Fixed-point  operations  can  also  be 
carried  out  in  a  special  half-word  format  in  which  two  independ- 
ently addressable  half-words  are  stored  in  a  single  full-word  storage 
location.  These  two  half-words  can  be  processed  either  separately, 
as  independent  words,  or  concurrently  in  duplex  format.  In  duplex 

Table 2  Types of internal operations

Primary computer

Name                                   Abbreviation

Arithmetic operations:
  Add                                  AD
  Augment                              AG
  Subtract                             SB
  Multiply                             MP
  Divide                               DV
  Square-root                          SQ
  Shift                                SH

Nonnumerical processing operations:
  Transplant Segment with Shift        TL
  Generate Boolean Functions           GB

Choice operations:
  Compare, Algebraic                   CA
  Compare, Modulus                     CM
  Compare, Equality                    CE
  Check Scale                          CS
  Compare Boolean Functions            CB

Control operations:
  Transfer Between Storage Units       TS
  Regulate Secondary Computer          RS


format,  the  respective  lefthand  and  righthand  halves  of  each 
double  operand  are  processed  simultaneously  in  a  single  instruc- 
tion time,  and  the  two  independent  half-word  results  are  written 
back  in  the  corresponding  halves  of  the  full-length  result  location. 
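The duplex format described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each half-word is taken to be 32 bits (half of the 64-bit number field), the halves are treated as unsigned, and an overflowing half simply wraps around. The text does not specify PILOT's sign or overflow handling for half-words, and the function names are hypothetical.

```python
HALF_BITS = 32                       # assumed width of one half-word
HALF_MASK = (1 << HALF_BITS) - 1


def split_halves(word):
    """Return the (left, right) halves of a 64-bit word."""
    return (word >> HALF_BITS) & HALF_MASK, word & HALF_MASK


def join_halves(left, right):
    """Pack two half-words into one full-word storage location."""
    return ((left & HALF_MASK) << HALF_BITS) | (right & HALF_MASK)


def duplex_add(a, b):
    """Add left to left and right to right; no carry crosses the halves."""
    a_left, a_right = split_halves(a)
    b_left, b_right = split_halves(b)
    return join_halves(a_left + b_left, a_right + b_right)


x = join_halves(3, 0xFFFFFFFF)
y = join_halves(4, 1)
print(split_halves(duplex_add(x, y)))   # (7, 0): the right half wrapped, the left half is unaffected
```

The point of the sketch is that the two results are independent: the overflow in the right half does not disturb the left half.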

Program  control  unit.  The  program  control  unit  interprets  and 
regulates  the  sequencing  of  instructions  in  the  program.  It  operates 
with a 68-bit binary-coded 3-address instruction word. See Table
3.  Each  instruction  word  contains  three  16-bit  codes  which  specify 
the  addresses  of  each  of  two  operands,  alpha  and  beta,  and  usually 
the  address  of  the  result  of  the  operation,  gamma,  in  the  main 
memory.  The  memory  location  of  the  next  instruction  word  is 
specified by a 16-bit address number contained in one of 16 possible
base registers; a 4-bit code in the instruction word (d-digits)
specifies  which  one  of  the  base  registers  contains  the  desired  word. 
Whenever  a  register  is  so  used  as  a  next-instruction  address  source, 
its  contents  are  automatically  increased  by  unity.  Choice  instruc- 
tions, used  for  program  branching,  from  time  to  time  may  cause 
a  new  alternative  address  number  to  be  inserted  in  any  one  of 
the  base  registers.  This  register  is  then  used  as  the  source  of  the 
address  number  of  the  next  instruction. 
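The next-instruction mechanism just described can be modeled roughly as follows. The class and method names are hypothetical, and wrapping the register at 16 bits is an assumption; the text states only that a base register supplies the address and is then increased by unity, and that a choice instruction may insert a new address into a register.

```python
class Sequencer:
    """Toy model of next-instruction addressing through base registers."""

    def __init__(self):
        self.base = [0] * 16             # sixteen 16-bit base registers

    def next_address(self, d):
        """Use base register d as the instruction address, then add one to it."""
        address = self.base[d]
        self.base[d] = (address + 1) & 0xFFFF   # assumed 16-bit wraparound
        return address

    def branch(self, d, target):
        """Model a choice instruction inserting a new address into register d."""
        self.base[d] = target & 0xFFFF


seq = Sequencer()
seq.branch(2, 100)
trace = [seq.next_address(2), seq.next_address(2)]
seq.branch(2, 500)                        # a choice instruction redirects the sequence
trace.append(seq.next_address(2))
print(trace)                              # [100, 101, 500]
```

Because any of the sixteen registers can serve as the instruction counter, a program can keep several instruction sequences "parked" and switch among them by changing the d-digits.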


Secondary computer (Table 2, continued)

Name                                   Abbreviation

Clear add                              ca
Hold add                               ha
Store positive                         sp
Transfer                               tr
Increase                               in
Decrease                               de
Logical Multiply                       lm
Compare, Zero                          cz
Compare, Righthand Bit                 cr
Compare, Lefthand Bit                  cl
Compare, Negative                      cn
Check Primary and Proceed              cp
Check Primary and Wait                 cw
Regulate Primary Computer              rp
Replace Primary Instruction            ri
Secondary Take Input from Primary      si

Leiner,  Notz,  Smith,  Weinberger — PILOT 




Table 3  Contents of primary instruction word

Digits numbered 1 through 68

Digits    Field                   Contents

68-65     Tags                    000^
64-61     Address alpha           a-digits
60-49     Address alpha           (digits 60-57, 56-53, 52-49)
48-45     Address beta            b-digits
44-33     Address beta            (digits 44-41, 40-37, 36-33)
32-29     Address gamma           c-digits
28-17     Address gamma           (digits 28-25, 24-21, 20-17)
16-13     Next Instn.             d-digits
12-9      Code for Operation      Parameter
8-5       Code for Operation      Basic Type
4-1       Mon. Break Point        e-digits

Addresses alpha, beta, and gamma written in the instruction
word are subject to automatic modification if desired by writing
a 1-digit in a specified bit position. Such addresses are called
relative addresses. Each of the three addresses (α, β, and γ) in each
instruction word contains a 4-bit code group, called the a-, b-,
and c-digits respectively, in which any base register identification
number (0 through 15) may be written. When this is done, the
address number to which the computer actually refers is equal
to the sum (modulo 2^16) of the address number stored in the
designated base register plus an address-modification constant,
indicated in the remaining 12 bits of the 16-bit address segment
of the instruction word.
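As a rough illustration of relative addressing, the sketch below splits a primary instruction word into the fields of Table 3 and forms each effective address as base register plus 12-bit constant. Several points are assumptions rather than facts from the text: digit 1 is taken to be the least significant bit, the modulus is taken to be 2^16 (the width of an address number), and modification is always applied, ignoring the bit that selects it. The helper names are hypothetical.

```python
def field(word, hi, lo):
    """Extract digits hi..lo (1-based; digit 1 assumed least significant)."""
    return (word >> (lo - 1)) & ((1 << (hi - lo + 1)) - 1)


def effective_address(address_field, base):
    """Base register contents plus the 12-bit constant, modulo 2**16 (assumed)."""
    register = address_field >> 12       # 4-bit a-, b-, or c-digits
    constant = address_field & 0xFFF     # 12-bit address-modification constant
    return (base[register] + constant) % (1 << 16)


def decode_primary(word, base):
    """Split an instruction word into the fields listed in Table 3."""
    return {
        "alpha": effective_address(field(word, 64, 49), base),
        "beta": effective_address(field(word, 48, 33), base),
        "gamma": effective_address(field(word, 32, 17), base),
        "d": field(word, 16, 13),
        "parameter": field(word, 12, 9),
        "basic_type": field(word, 8, 5),
        "e": field(word, 4, 1),
    }


base = [0] * 16
base[3] = 0x1000
base[15] = 0xFFFF
# alpha uses register 3 + 0x00A, beta uses register 15 + 0x002, gamma uses register 0 + 0x123
word = (0x300A << 48) | (0xF002 << 32) | (0x0123 << 16) | (0x5 << 12) | (0x2 << 8) | (0x7 << 4) | 0x1
info = decode_primary(word, base)
print(hex(info["alpha"]), hex(info["beta"]), hex(info["gamma"]))   # 0x100a 0x1 0x123
```

The beta address shows the modular sum: 0xFFFF plus 2 wraps around to 1.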

Primary  storage  units 

Fast  access  memory.  Because  of  budget  limitations,  the  initial 
installation  of  the  system  will  contain  only  a  relatively  small 
section  of  internal  memory  of  the  diode-capacitor  type.  This 
diode-capacitor  memory,  originally  developed  at  NBS  in  1953,  is 
very fast; i.e., capable of providing one random access per microsecond,
but it has the disadvantage of relatively high cost per word
of storage. This type of memory is available in modules of 256
words  subdivided  as  follows: 

Numerical information          64 bits
Algebraic signs and tags        4 bits
Parity check digits             4 bits
Total word length              72 bits

The over-all system is designed to accommodate up to 32,768
internally-accessible full-words, which may be held in storage units
with access times ranging from 1 microsecond (μsec) to 32 μsec.
Thus  the  minimum  fast  access  memory  can  be  backed  up  with 
a  much  larger  and  slower  magnetic-core  memory. 

Inter-memory  transfer  trunk.  Provision  is  made  for  transferring 
blocks  of  information  between  the  various  internal  storage  units 


in  the  system,  concurrently  with  computation.  The  size  of  the 
block  transferred  may  range  from  a  single  word  to  the  entire 
contents  of  the  memory,  and  the  addresses  between  which  the 
information is transferred are specified by a single programmed
inter-memory transfer instruction. Automatic interlocks are provided
to insure that all future references which the program may
make to any memory positions involved in the inter-memory
transfer  operation  are  automatically  made  after  the  data  have  been 
shifted  to  the  new  locations. 
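The interlock behaviour can be pictured with a small deterministic simulation. This is a sketch of the idea only, not of the PILOT hardware: the word-at-a-time queue, the class, and its method names are all hypothetical. A program reference to any position still involved in a pending transfer is held until the transfer has dealt with that position.

```python
class Memory:
    """Toy memory with a concurrent block transfer and an access interlock."""

    def __init__(self, size):
        self.words = [0] * size
        self.pending = []                # queued (source, destination) word copies

    def start_transfer(self, src, dst, count):
        """Queue a block transfer that proceeds alongside computation."""
        self.pending.extend((src + i, dst + i) for i in range(count))

    def step(self):
        """Carry out one queued word copy."""
        if self.pending:
            source, destination = self.pending.pop(0)
            self.words[destination] = self.words[source]

    def _interlock(self, address):
        # Hold the program until no queued copy still involves this address.
        while any(address in pair for pair in self.pending):
            self.step()

    def read(self, address):
        self._interlock(address)
        return self.words[address]

    def write(self, address, value):
        self._interlock(address)
        self.words[address] = value


mem = Memory(16)
mem.words[0:4] = [10, 11, 12, 13]
mem.start_transfer(src=0, dst=8, count=4)
print(mem.read(10))          # 12: the read waited until word 2 had been copied to address 10
print(len(mem.pending))      # 1: the last word is still in transit
```

A reference to an address outside the block, by contrast, would return at once while the transfer continued in the background.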

Secondary  computer 

Arithmetic and processing unit. The secondary computer is a
high-speed independently programmable general-purpose computer
that operates in conjunction with the primary computer and
can perform 16 distinct types of operations using 16-bit words.
These operations include 6 arithmetic-processing operations, 4
choice operations, 1 nonnumerical processing operation, and 5
operations that transfer digital information or control-signals between
the primary and the secondary computers. See Table 2.
Operation times for the secondary computer average about 2 μsec.

Both  computers  operate  concurrently  and  can  transfer  infor- 
mation back  and  forth  between  each  other.  One  of  the  principal 
functions of the secondary computer is to carry out so-called
"red-tape" operations, such as: (1) counting iterations, (2) systematically
modifying the addresses of the operands and instructions
referred  to  by  the  primary  program,  (3)  monitoring  the  primary 
program,  and  (4)  various  special  tasks.  Through  the  use  of  special 
subroutines  for  the  secondary  computer,  both  computers  acting 
co-operatively  can  be  made  to  carry  out  a  wide  variety  of  complex 
operations  without  unduly  complicating  the  writing  of  the  primary 
computer  programs.  Examples  of  such  operations  are:  (1)  special 
types  of  sorting,  (2)  logarithmic  search,  (3)  routines  involving 
cross-referencing,  or  items  selected  according  to  an  attached  code, 
(4)  error  analyses,  and  (5)  operations  involving  small  numerical 
fields. 




Secondary storage unit. Associated with the secondary computer
is the secondary storage unit which consists of 60 storage locations
containing 16-bit words. Sixteen of these locations can be used
as base registers by the primary computer and may be selected
by the primary computer according to the a-, b-, c-, and d-digits
in  the  primary  instruction  word.  The  contents  of  the  registers 
selected  by  the  primary  computer  in  this  way  are  automatically 
added  to  the  address  numbers  specified  in  the  primary  computer 
instruction  word.  The  secondary  storage  unit  is  also  capable  of 
being  addressed  directly  by  the  primary  computer.  The  fifteen 
4-word  blocks  of  the  secondary  storage  are  identified  by  15  special 
primary  address  numbers.  Other  addressable  registers  associated 
with  the  secondary  storage  hold  the  address  numbers  of  current 
and  next  instruction  words  in  the  primary  program. 
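One way to picture the fifteen 4-word blocks is as fifteen 64-bit quantities, each readable by the primary computer through one address number. The sketch below makes that reading explicit. It rests on assumptions: the text does not say that a block maps onto the 64-bit number field of a primary word, nor in what order the four words are packed, and the function names are hypothetical.

```python
WORDS_PER_BLOCK = 4
WORD_MASK = 0xFFFF                      # 16-bit secondary words


def read_block(storage, block):
    """Pack the four 16-bit words of one block into a single 64-bit value."""
    start = block * WORDS_PER_BLOCK
    value = 0
    for word in storage[start:start + WORDS_PER_BLOCK]:
        value = (value << 16) | (word & WORD_MASK)   # first word ends up leftmost (assumed)
    return value


def write_block(storage, block, value):
    """Spread a 64-bit value over the four 16-bit words of one block."""
    start = block * WORDS_PER_BLOCK
    for offset in range(WORDS_PER_BLOCK - 1, -1, -1):
        storage[start + offset] = value & WORD_MASK
        value >>= 16


secondary = [0] * 60                    # 60 locations = 15 blocks of 4 words
secondary[4:8] = [0x0001, 0x0002, 0x0003, 0x0004]
print(hex(read_block(secondary, 1)))    # 0x1000200030004
write_block(secondary, 14, 0xAAAABBBBCCCCDDDD)
print([hex(w) for w in secondary[56:60]])   # ['0xaaaa', '0xbbbb', '0xcccc', '0xdddd']
```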

Program  control  unit.  The  secondary  computer  program  operates 
with  a  2-address  instruction  system,  the  addresses  referring  to 
words  in  the  secondary  storage  unit,  including  the  base  registers. 
See  Table  4.  From  time  to  time  the  primary  instruction  program 
may  order  the  insertion  of  a  new  instruction  into  the  secondary 
instruction  register  or  may  order  the  transfer  of  data  in  either 
direction  between  the  primary  storage  units  and  the  secondary 
storage  unit.  The  secondary  computer  program  may  also  cause  data 
to  be  transferred  into  the  secondary  storage  unit  from  the  primary 
instruction  register  and  can  also  cause  information  to  be  trans- 
ferred into  the  primary  instruction  register  from  a  location  in  the 
main  memory. 

Using  these  facilities,  the  secondary  computer  can  inspect  each 
instruction  word  in  the  primary  program  as  it  is  selected  from  the 
primary  store  and,  acting  upon  specifications  written  into  the 
secondary  program,  can  cause  the  primary  instruction  either  to 
be  executed  as  written  or  to  be  replaced  by  a  new  instruction  word 
from  a  memory  location  determined  by  the  secondary.  Other  types 
of  discrimination  can  be  effected  by  the  secondary  that  depend 
upon  the  result  of  a  primary  operation,  such  as  an  overflow,  jump, 
etc.  These  features  facilitate  the  use  of  interpretive  programming 
methods. 
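The inspect-and-replace facility lends itself to a short sketch of interpretive execution. Everything concrete here is hypothetical: the tuple representation of instructions, the "SIN" and "CALL" operation names (which are not PILOT operations), and the rule used by the secondary. Only the overall pattern, execute as written or substitute a word from a memory location chosen by the secondary, comes from the text.

```python
def run(program, memory, inspect):
    """Fetch each primary instruction, let the secondary inspect it, then 'execute' it."""
    executed = []
    for instruction in program:
        replacement_address = inspect(instruction)
        if replacement_address is not None:
            # Replace the primary instruction with a word chosen by the secondary.
            instruction = memory[replacement_address]
        executed.append(instruction)
    return executed


def secondary_rule(instruction):
    """Hypothetical secondary program: trap an operation to be interpreted."""
    return 200 if instruction[0] == "SIN" else None


memory = {200: ("CALL", "sine_subroutine")}
program = [("AD", 1, 2, 3), ("SIN", 4, 0, 5), ("MP", 3, 5, 6)]
for step in run(program, memory, secondary_rule):
    print(step)
```

The primary program is written as if the trapped operation existed; the secondary quietly substitutes the instruction that implements it.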


Table 4  Contents of secondary instruction word

Digits numbered 1 through 16

Digits    Contents

16-13     Operation code (0-15)
12-7      Address "g"
6-1       Address "h"
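Reading Table 4 as a 4-bit operation code in digits 16-13 followed by two 6-bit addresses in digits 12-7 and 6-1, a secondary instruction word can be packed and unpacked as shown below. Treating digit 1 as the least significant bit is an assumption, and the function names are hypothetical. Six address bits are enough to reach all 60 locations of the secondary storage unit.

```python
def encode_secondary(op, g, h):
    """Pack operation code and the two addresses into a 16-bit word."""
    if not (0 <= op < 16 and 0 <= g < 64 and 0 <= h < 64):
        raise ValueError("field out of range")
    return (op << 12) | (g << 6) | h


def decode_secondary(word):
    """Return (operation code, address g, address h)."""
    return (word >> 12) & 0xF, (word >> 6) & 0x3F, word & 0x3F


word = encode_secondary(5, 59, 16)
print(f"{word:016b}")            # 0101111011010000
print(decode_secondary(word))    # (5, 59, 16)
```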

Input-output  control 

Concurrent  input-output  trunks.  The  concurrent  input-output 
trunks have the function of controlling the transfer of information
in  either  direction  between  the  internal  memory  and  the  external 
storage  units.  All  input-output  transfers  are  initiated  by  a  single 
internally  programmed  instruction,  and  are  carried  out  by  the 
trunk  units  with  the  aid  of  automatic  interlocks  similar  to  those 
used in the inter-memory transfer trunk for preventing interference
with the progress of the computing program. The size of the
block  of  data  that  is  transferred  may  range  from  a  single  word 
to  the  entire  contents  of  the  memory  and  may  be  directed  to  any 
addresses.  Using  two  such  trunks,  it  is  possible  to  maintain  two 
continuous  streams  of  data  simultaneously  flowing  between  the 
internal  memory  and  any  two  external  storage  units  without 
interrupting the progress of the computations.

Format controller. Data that are passing in and out of the internal
storage  system  via  the  input-output  trunks  are  subject  to  further 
concurrent  processing  by  the  format  controller.  The  format  con- 
troller is  an  independent  internally-programmed  data-processing 
unit  specially  designed  for  carrying  out  general-purpose  editing, 
inspecting,  and  format-modifying  operations  on  incoming  or  out- 
going data.  Programs  for  the  format  controller  are  stored  on 
removable  plugboards,  and  the  primary  computer  program  is  able 
to  direct  the  format  controller  to  select  whichever  particular 
format  program  may  be  appropriate  from  among  the  small  library 
of  format  programs  contained  on  the  boards  currently  attached 
to  the  machine.  Among  the  typical  kinds  of  programs  that  the 
format  controller  can  carry  out  are:  (1)  searching  of  magnetic  tapes 
for  words  bearing  identifying  addresses  or  other  coded  labels 
specified  by  the  internal  program,  with  selective  input  or  output 
of  data  at  these  selected  tape  locations,  (2)  insertion  of  incoming 
data  for  the  internal  storage  units  of  the  system  into  address 
locations  specified  by  the  incoming  data  itself,  (3)  conversion  and 
rearrangement  of  data  that  are  stored  on  external  units  in  formats 
not  compatible  with  the  formats  used  in  the  internal  units;  e.g., 
binary-decimal  character  conversion,  adjustment  of  word-length 
modules,  etc. 
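The binary-decimal conversion mentioned in item (3) can be illustrated with packed binary-coded decimal, four bits per decimal digit. This is a generic sketch of the conversion, not a description of the format controller's plugboard programs, and the function names are hypothetical.

```python
def to_bcd(n):
    """Encode a non-negative integer as packed binary-coded decimal."""
    if n < 0:
        raise ValueError("only non-negative values are handled")
    bcd, shift = 0, 0
    while True:
        n, digit = divmod(n, 10)
        bcd |= digit << shift        # each decimal digit occupies four bits
        shift += 4
        if n == 0:
            return bcd


def from_bcd(bcd):
    """Decode packed binary-coded decimal back to an integer."""
    value, place = 0, 1
    while bcd:
        digit = bcd & 0xF
        if digit > 9:
            raise ValueError("not a valid decimal digit")
        value += digit * place
        place *= 10
        bcd >>= 4
    return value


print(hex(to_bcd(1959)))     # 0x1959
print(from_bcd(0x0440))      # 440
```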

External  storage 

External  storage  in  the  initial  installation  of  the  system  will  consist 
mainly  of  magnetic  tape  units.  Because  of  the  flexibility  of  the 
format  controller,  it  will  be  possible  to  supplement  these  tape  units 
later  with  a  wide  variety  of  other  types  of  external  units  without 
making  any  significant  changes  in  the  existing  equipment. 




Input-output  units 

The  system  is  designed  to  operate  with  a  wide  variety  of  input- 
output  devices,  both  digital  and  analog. 

Input  readers  and  printers.  Flexowriter  units  and  paper-tape  read- 
ers and  punches  will  be  available  in  the  initial  installation. 
Punched  card  input  readers  and  high-speed  printers,  along  with 
their auxiliary controls, may be attached to the format controller
in  the  manner  indicated  in  the  preceding  paragraph. 

Displays.  Two  types  of  displays  are  provided  for:  (1)  pilot-light 
display  of  data  and  control  information  in  the  various  registers 
and  flip-flops  throughout  the  system,  in  order  to  aid  the  rapid 
diagnosis of equipment malfunctions or programming faults, and
(2)  picture-tube  display  of  real-time  data  stored  in  the  internal 
memory  of  the  system.  This  kinematic  diagram  type  of  display 
is  very  important  when  performing  dynamic  simulation  operations 
which  require  visual  presentation  of  the  simulated  data  in  real- 
time to  the  human  operators. 

External  control 

Manual-monitor  control.  The  term  "manual-monitor"  was  coined 
at  NBS  several  years  ago  to  describe  certain  types  of  control 
operations  that  are  initiated  either  manually  by  the  machine 
operator or by the machine itself under conditions which are
specified by means of external switch settings. The former is
referred  to  as  a  manual  operation  and  the  latter  is  called  a  monitor 
operation  because  the  machine  must  monitor  its  internal  program 
to  determine  precisely  when  the  operation  should  be  performed. 
The  type  of  operation  to  be  performed  as  well  as  the  conditions 
under which it is to be performed are specified by means of
external  switch  settings. 

This  feature  provides  for  convenient  communication  between 


the  data-processor  and  the  operator,  and  allows  the  operator  to 
monitor  the  progress  of  the  program  automatically,  to  insert  new 
data  and  instructions,  and  to  withdraw  intermediate  results  con- 
veniently, without  need  for  advance  preparation  of  special  pro- 
grams. This  is  particularly  useful  in  debugging  programs  and  in 
checking equipment malfunctions.

Monitor  operations  are  performed  by  the  machine  whenever 
the  conditions  specified  by  the  external  switch  settings  occur  in 
the  course  of  the  program;  e.g.,  every  time  the  program  refers  to 
a  new  instruction,  any  time  the  program  refers  to  an  instruction 
to  which  a  special  monitor  breakpoint  symbol  (e-digits)  is  attached, 
any time an arithmetic overflow occurs, etc. By pairing a particular
type  of  manual-monitor  operation  with  a  selected  set  of  conditions, 
a  variety  of  special  composite  operations  can  be  performed. 
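The pairing of a monitor operation with a selected set of conditions can be sketched as a list of predicates checked against each program step. This is an illustrative model only: the dictionary layout of a step, the use of a nonzero e-digit value to mark a breakpoint, and all names are assumptions, standing in for what the machine does with external switch settings.

```python
def monitor(trace, conditions, operation):
    """Apply `operation` to every step for which any selected condition holds."""
    results = []
    for step in trace:
        if any(condition(step) for condition in conditions):
            results.append(operation(step))
    return results


def breakpoint_tagged(step):
    """Condition: the instruction carries a monitor breakpoint symbol (assumed nonzero)."""
    return step["e"] != 0


def overflowed(step):
    """Condition: an arithmetic overflow occurred on this step."""
    return step["overflow"]


trace = [
    {"address": 100, "e": 0, "overflow": False},
    {"address": 101, "e": 1, "overflow": False},
    {"address": 102, "e": 0, "overflow": True},
]
# Composite operation: report the address whenever either condition occurs.
print(monitor(trace, [breakpoint_tagged, overflowed], lambda step: step["address"]))   # [101, 102]
```

Changing the list of conditions or the operation corresponds to changing the switch settings; the program being monitored is left untouched.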

Remote  controls.  Manual-monitor  operations  can  be  specified  and 
initiated  by  external  devices  as  well  as  by  human  operators.  Since 
all of the external switch settings control only d-c voltages, the
external  devices  can  even  be  remote  from  the  machine  itself,  and 
from a distance, via ordinary electrical transmission lines, they can
exercise  supervisory  control  over  the  internal  program  of  the 
machine.  This  makes  it  possible  to  harness  together  two  or  more 
remotely  located  data-processing  machines,  and  have  them  work 
together  co-operatively  on  a  common  task.  Each  member  of  such 
an  interconnected  network  of  separate  data  processors  is  free  at 
any  time  to  initiate  and  dispatch  special  control  orders  to  any  of 
its partners in the system. As a consequence, the supervisory
control  over  the  common  task  may  be  shared  among  the  various 
members  of  the  system,  and  may  be  passed  back  and  forth  from 
one  machine  to  the  other  as  the  need  arises. 

References 

LeinA57, 59


Section  3 

Computers  for  multiprocessing 
and  parallel  processing 

The  computers  in  this  section  are  probably  the  most  general 
in  the  book.  Although  the  general  PMS  model  for  a  computer 
in  Chap.  3,  page  65,  characterizes  these  computers,  the  struc- 
ture by  Lehman  (Chap.  37)  most  closely  fits  the  model.  The 
Burroughs computers that are presented have multiple Pc's;¹
however,  K's  are  used  for  control  of  device  K's,  rather  than 
Pio's— perhaps  a  wise  choice. 

D825—a multiple-computer system for command and control

The  Burroughs  D825  computer  is  discussed,  together  with  other 
stack  processors,  in  Part  3,  Sec.  5,  page  257.  Chapter  36 
emphasizes  the  PMS  structure  and  operating  system  charac- 
teristics necessary  in  a  multiprocessor  system. 

¹As does the B 8500, a successor to the D825; however, its successor, the
B 8501, is designed with Pio's.


Design  of  the  B  5000  system 

This  computer  (Chap.  22)  is  discussed,  together  with  other  stack 
processors,  in  Part  3,  Sec.  5,  page  257. 

A  survey  of  problems  and  preliminary  results  concerning 
parallel  processing  and  parallel  processors 

Chapter  37,  by  M.  Lehman,  provides  a  very  good  introduction 
to  the  concepts  of  multiprogramming,  multiprocessing,  and 
parallel  processing.  A  specific  multiprocessor  computer  struc- 
ture is  postulated  to  provide  parallel  processing.  The  processing 
ability  of  the  structure  is  analyzed  at  the  instruction  level.  It 
is  significant  that  the  paper  is  by  an  IBM  scientist.  IBM  has 
not  been  particularly  advanced  in  the  use  of  multiple  arithmetic 
processor  computers. 


Chapter  36 


D825— a  multiple-computer  system 
for command and control¹

James  P.  Anderson  /  Samuel  A.  Hoffman 
Joseph Shifman / Robert J. Williams

Introduction 

The  D825  Modular  Data  Processing  System  is  the  result  of  a 
Burroughs  study,  initiated  several  years  ago,  of  the  data  processing 
requirements  for  command  and  control  systems.  The  D825  has 
been  developed  for  operation  in  the  military  environment.  The 
initial  system,  constructed  for  the  Naval  Research  Laboratory  with 
the  designation  AN/GYK-3(V),  has  been  completed  and  tested. 
This  paper  reviews  the  design  criteria  analysis  and  design  rationale 
that led to the system structure of the D825. The implementation
and  operation  of  the  system  are  also  described.  Of  particular 
interest  is  the  role  that  developed  for  an  operating  system  program 
in  coordinating  the  system  components. 

Functional  requirements  of  command  and  control  data  processing 

By "command and control system" is meant a system having the
capacity  to  monitor  and  direct  all  aspects  of  the  operation  of  a 
large  man  and  machine  complex.  Until  now,  the  term  has  been 
applied  exclusively  to  certain  military  complexes,  but  could  as  well 
be  applied  to  a  fully  integrated  air  traffic  control  system  or  even 
to the operation of a large industrial complex. Operation of command
and control systems is characterized by an enormous quantity
of diverse but interrelated tasks, generally arising in real
time, which are best performed by automatic data-processing
equipment,  and  are  most  effectively  controlled  in  a  fully  integrated 
central data processing facility. The data processing functions
alluded to are those typical of data processing, plus special functions
associated with servicing displays, responding to manual
insertion (through consoles) of data, and dealing with communications
facilities. The design implications of these functions will be
considered  here. 

Availability criteria. The primary requirement of the data-processing
facility, above all else, is availability. This requirement,
essentially a function of hardware reliability and maintainability,

¹AFIPS Proc. FJCC, vol. 22, pp. 86-96, 1962.


is,  to  the  user,  simply  the  percentage  of  available,  on-line,  opera- 
tion time  during  a  given  time  period.  Every  system  designer  must 
trade  off  the  costs  of  designing  for  reliability  against  those  incurred 
by  unavailability,  but  in  no  other  application  are  the  costs  of 
unavailability  so  high  as  those  presented  in  command  and  control. 
Not  only  is  the  requirement  for  hardware  reliability  greater  than 
that  of  commercial  systems,  but  downtime  for  the  complete  system 
for  preventive  maintenance  cannot  be  permitted.  Depending  upon 
the  application,  some  greater  or  lesser  portion  of  the  complete 
system must always be available for primary system functions, and
all of the system must be available most of the time.

The  data  processing  facility  may  also  be  called  upon,  except 
at  the  most  critical  times,  to  take  part  in  exercising  and  evaluating 
the  operation  of  some  parts  of  the  system,  or,  in  fact,  in  actual 
simulation  of  system  functions.  During  such  exercises  and  simula- 
tions, the  system  must  maintain  some  (although  perhaps  partially 
and  temporarily  degraded)  real-life  and  real-time  capability,  and 
must be able to return quickly to full operation. An implication
here,  of  profound  significance  in  system  design,  is,  again,  the 
requirement  that  most  of  the  system  be  always  available;  there 
must  be  no  system  elements  (unsupported  by  alternates)  perform- 
ing functions  so  critical  that  failure  at  these  points  could  compro- 
mise the  primary  system  functions. 

Adaptability criteria. Another requirement, equally difficult to
achieve,  is  that  the  computer  system  must  be  able  to  analyze  the 
demands  being  made  upon  it  at  any  given  time,  and  determine 
from  this  analysis  the  attention  and  emphasis  that  should  be  given 
to  the  individual  tasks  of  the  problem  mix  presented.  The  working 
configuration  of  the  system  must  be  completely  adaptable  so  as 
to accommodate the diverse problem mixes, and, moreover, must
respond  quickly  to  important  changes,  such  as  might  be  indicated 
by external alarms or the results of internal computations (exceeding
of certain thresholds, for example), or to changes in the hardware
configuration resulting from the failure of a system component
or from its intentional removal from the system. The system




must have the ability to be dynamically and automatically restructured
to a working configuration that is responsive to the
problem-mix  environment. 

Expansibility  criteria.  The  requirement  of  expansibility  is  not 
unique  to  command  and  control,  but  is  a  desirable  feature  in  any 
application  of  data  processing  equipment.  However,  the  need  for 
expansibility  is  more  acute  in  command  and  control  because  of 
the  dependence  of  much  of  the  efficacy  of  the  system  upon  an 
ability  to  meet  the  changing  requirements  brought  on  by  the  very 
rapidly  changing  technology  of  warfare.  Further,  it  must  be  possi- 
ble to  incorporate  new  functions  in  such  a  way  that  little  or  no 
transitional  downtime  results  in  any  hardware  area. 

Expansion  should  be  possible  without  incurring  the  costs  of 
providing  more  capability  than  is  needed  at  the  time.  This  ability 
of  the  system  to  grow  to  meet  demands  should  apply  not  only  to 
the  conventionally  expansible  areas  of  memory  and  I/O  but  to 
computational  devices,  as  well. 

Programming  criteria.  Expansion  of  the  data-processing  facility 
should  require  no  reprogramming  of  old  functions,  and  programs 
for  new  functions  should  be  easily  incorporated  into  the  overall 
system.  To  achieve  this  capability,  programs  must  be  written  in 
a  manner  which  is  independent  of  system  configuration  or  problem 
mix,  and  should  even  be  interchangeable  between  sites  performing 
like  tasks  in  different  geographic  locales.  Finally,  because  of  the 
large  volume  of  routines  that  must  be  written  for  a  command  and 
control  system,  it  should  be  possible  for  many  different  people, 
in  different  locations  and  of  different  areas  of  responsibility,  to 
write  portions  of  programs,  and  for  the  programs  to  be  subse- 
quently linked  together  by  a  suitable  operating  system. 

Concomitant  with  the  latter  requirement  and  with  that  of 
configuration-independent  programs  is  the  desirability  of  orienting 
system  design  and  operation  toward  the  use  of  a  high-level  pro- 
cedure-oriented language.  The  language  should  have  the  features 
of  the  usual  algorithmic  languages  for  scientific  computations,  but 
should  also  include  provisions  for  maintaining  large  files  of  data 
sets  which  may,  in  fact,  be  ill-structured.  It  is  also  desirable  that 
the  language  reflect  the  special  nature  of  the  application;  this  is 
especially  true  when  the  language  is  used  to  direct  the  storage 
and  retrieval  of  data. 

Design  rationale  for  the  data-processing  facility 

The three requirements of availability, adaptability, and expansibility
were the motivating considerations in developing the D825
design. In arriving at the final systems design, several existing and
proposed schemes for the organization of data processing systems
were evaluated in light of the requirements listed above. Many
of the same conclusions regarding these and other schemes in the
use of computers in command and control were reached independently
in a more recent study conducted for the Department
of Defense by the Institute for Defense Analysis [Kroger et al.,
1961].

The single-computer system. The most obvious system scheme, and
the least acceptable for command and control, is the single-computer
system. This scheme fails to meet the availability requirement
simply because the failure of any part (computer, memory,
or I/O control) disables the entire system. Such a system was not
given serious consideration.

Replicated  single-computer  systems.  A  system  organization  that  had 
been  well  known  at  the  time  these  considerations  were  active 
involves  the  duplication  (or  triplication,  etc.)  of  single-computer 
systems  to  obtain  availability  and  greater  processing  rates.  This 
approach  appears  initially  attractive,  inasmuch  as  programs  for 
the  application  may  be  split  among  two  or  more  independent 
single-computer  systems,  using  as  many  such  systems  as  needed 
to  perform  all  of  the  required  computation.  Even  the  availability 
requirement  seems  satisfied,  since  a  redundant  system  may  be  kept 
in  idle  reserve  as  backup  for  the  main  function. 

On closer examination, however, it was perceived that such
a system had many disadvantages for command and control applications.
Besides requiring considerable human effort to coordinate
the operation of the systems, and considerable waste of available
machine time, the replicated single computers were found to be
ineffective because of the highly interrelated way in which data
and programs are frequently used in command and control applications.
Further, the steps necessary to have the redundant or
backup system take over the main function, should the need arise,
would prove too cumbersome, particularly in a time-critical application
where constant monitoring of events is required.

Partially  shared  memory  schemes.  It  was  seen  that  if  the  replicated 
computer  scheme  were  to  be  modified  by  the  use  of  partially 
shared  memory,  some  important  new  capabilities  would  arise.  A 
partially  shared  memory  can  take  several  forms,  but  provides 
principally  for  some  shared  storage  and  some  storage  privately 
allotted to individual computers. The shared storage may be of
any  kind — tapes,  discs,  or  core — but  frequently  is  core.  Such  a 
system,  by  providing  a  direct  path  of  communication  between 
computers,  goes  a  long  way  toward  satisfying  the  requirements 
listed  above. 




The one advantage to be found in having some memory private
to each computer is that of data protection. This advantage vanishes
when it is necessary to exchange data between computers,
for if a computer failure were to occur, the contents of the private
memory of that computer would be lost to the system. Furthermore,
many tasks in the command and control application require
access to the same data. If, for example, it would be desirable to
permit some privately stored data to be made available to the fully
shared memory or to some other private memory, considerable
time would be lost in transferring the data. It is also clear that
a certain amount of utilization efficiency is lost, since some private
memory may be unused, while another computer may require
more memory than is directly available, and may be forced to
transfer other blocks of data back to bulk storage to make way
for the necessary storage. It might be added in passing that if
private I/O complements are considered, the same questions of
decreased overall availability and decreased efficiency arise.

Master/slave schemes. Another aspect of the partially shared
memory system is that of control. A number of such systems
employ a master/slave scheme to achieve control, a technique
wherein one computer, designated the master computer, coordinates
the work done by the others. The master computer might
be of a different character than the others, as in the PILOT system,
developed by the National Bureau of Standards [Leiner et al.,
1957], or it may be of the same basic design, differing only in its
prescribed role, as in the Thompson Ramo Wooldridge TRW400
(AN/FSQ-27) [Porter, 1960]. Such a scheme does recognize the
importance, for multicomputer systems, of the problem of coordinating
the processing effort; the master computer is an effective
means of accomplishing the coordination. However, there are
several difficulties in such a design. The loss of the master computer
would down the whole system, and the command and control
availability requirement could not, consequently, be met. If this
weakness is countered by providing the ability for the master
control function to be automatically switched to another processor,
there still remains an inherent inefficiency. If, for example, the
workload of the master computer becomes very large, the master
becomes a system bottleneck resulting in inefficient use of all other
system elements; and, on the other hand, if the workload fails to
keep the master busy, a waste of computing power results. The
conclusion is then reached that a master should be established only
when needed; this is what has been done in the design of the D825.

The totally modular scheme. As a result of these analyses, certain
implications became clear. The availability requirement dictated
a decentralization of the computing function, that is, a multiplicity
of computing units. However, the nature of the problem
required that data be freely communicable among these several
computers. It was decided, therefore, that the memory system
would be completely shared by all processors. And, from the point
of view of availability and efficiency, it was also seen to be undesirable
to associate I/O with a particular computer; the I/O
control was, therefore, also decoupled from the computers.

Furthermore, a system with several computers, totally shared
memory, and decoupled I/O seemed a perfect structure for satisfying
the adaptability requirements of command and control. Such
a structure resulted in a flexibility of control which was a fine
match for the dynamic, highly variable, processing requirements
to be encountered.

The major problem remaining to realize the computational
potential represented by such a system was, of course, that of
coordinating the many system elements to behave, at any given
time, like a system specifically designed to handle the set of tasks
with which it was faced at that time. Because of the limitations
of previously available equipment, an operating system program
had always been identified with the equipment running the program.
However, in the proposed design, the entire memory was
to be directly accessible to all computer modules, and the operating
system could, therefore, be decoupled from any specific computer.
The operation of the system could be coordinated by having
any processor in the complement run the operating system only
as the need arose. It became clear that the master computer had
actually become a program stored in totally shared memory, a
transformation which was also seen to offer enhanced programming
flexibility.

Up to this point, the need for identical computer modules had
not been established. The equality of responsibility among computing
units, which allowed each computer to perform as the
master when running the operating system, led finally to the design
specification of identical computer modules. These were freely
interconnected to a set of identical memory modules and a set
of identical I/O control modules, the latter, in turn, freely interconnected
to a highly variable and diverse I/O device complement.
It was clear that the complete modularity of system elements
was an effective solution to the problem of expansibility,
inasmuch as expansion could be accomplished simply by adding
modules identical to those in the existing complement. It was also
clear that important advantages and economies resulting from the
manufacture, maintenance, and spare parts provisioning for identical
modules also accrue to such a system. Perhaps the most
important result of a totally modular organization is that redundancy
of the required complement of any module type, for greater
reliability, is easily achieved by incorporating as little as one
additional module of that type in the system. Furthermore, the
additional module of each type need not be idle; the system may
be looked upon as operating with active spares.
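
The benefit of one active spare can be illustrated numerically. The sketch below assumes, purely for illustration, that modules fail independently and that each is operable with the same probability `p`; neither that assumption nor the figure 0.99 comes from the D825 design, and the function name `availability` is hypothetical.

```python
from math import comb

def availability(required: int, installed: int, p: float) -> float:
    """Probability that at least `required` of `installed` modules work,
    assuming independent modules, each operable with probability p."""
    return sum(
        comb(installed, k) * p**k * (1 - p) ** (installed - k)
        for k in range(required, installed + 1)
    )

# Two memory modules are needed: exactly two installed vs. one active spare.
no_spare = availability(2, 2, 0.99)
one_spare = availability(2, 3, 0.99)
```

Under these assumptions the single spare raises the chance of having the required complement from roughly 0.98 to roughly 0.9997, and the spare adds useful capacity whenever all three modules are working.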

Thus,  a  design  structure  based  upon  complete  modularity  was 
set. Two items remained to weld the various functional modules
into  a  coordinated  system — a  device  to  electronically  interconnect 
the  modules,  and  an  operating  system  program  with  the  effect  of 
a  master  computer,  to  coordinate  the  activities  of  the  modules  into 
fully  integrated  system  operation. 

In  the  D825,  these  two  tasks  are  carried  out  by  the  switching 
interlock  and  the  Automatic  Operating  and  Scheduling  Program 
(AOSP), respectively. Figure 1 shows how the various functional
modules  are  interconnected  via  the  interlock  in  a  matrix-like 
fashion. 

System  implementation 

Most  important  in  the  design  implementation  of  the  D825  were 
studies  toward  practical  realization  of  the  switching  interlock  and 
the  AOSP.  The  computer,  memory,  and  I/O  control  modules 
permitted more conventional solutions, but were each to incorporate
some unusual features, while many of the I/O devices were
selected from existing equipment. With the exception of the latter,
all of these elements are discussed here briefly. (A summary of
D825  characteristics  and  specifications  is  included  at  the  end  of 
the  paper.) 

Switching interlock. Having determined that only a completely
shared memory system would be adequate, it was necessary to find
some way to permit access to any memory by any processor, and,
in fact, to permit sharing of a memory module by two or more
processors or I/O control modules.

A function distributed physically through all of the modules
of a D825 system, but which has been designated in aggregate
the switching interlock, effects electronically each of the many
brief interconnections by which all information is transferred
among computer, memory, and I/O control modules. In addition
to the electronic switching function, the switching interlock has
the ability to detect and resolve conflicts such as occur when two
or more computer modules attempt access to the same memory
module.

The switching interlock consists functionally of a crosspoint
switch matrix which effects the actual switching of bus interconnections,
and a bus allocator which resolves all time conflicts
resulting from simultaneous requests for access to the same bus
or system module. Conflicting requests are queued up according
to the priority assigned to the requestors. Priorities are pre-emptive
in that the appearance of a higher priority request will
cause service of that request before service of a lower priority
request already in the queue. Analyses of queueing probabilities
have shown that queues longer than one are extremely unlikely.

The priority scheduling function is performed by the bus allocator,
essentially a set of logical matrices. The conflict matrix
detects the presence of conflicts in requests for interconnection.
The priority matrix resolves the priority of each request. The
logical product of the states of the conflict and priority matrices
determines the state of the queue matrix, which in turn governs
the setting of the crosspoint switch, unless the requested module
is busy.
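
The allocator's behaviour can be sketched in a few lines of Python. This is an illustrative software model only, not the hardware matrices themselves: the function name `allocate`, the request tuples, the module labels, and the convention that a lower rank means higher priority are all assumptions made for the example.

```python
from collections import defaultdict

def allocate(requests, priority, busy=frozenset()):
    """Grant at most one requestor per target module.

    requests: iterable of (requestor, target) pairs (hypothetical labels).
    priority: dict mapping requestor -> rank; lower rank wins (assumed).
    busy:     targets that cannot be connected in this cycle.
    Returns (grants, queues): grants maps target -> requestor,
    queues maps target -> waiting requestors in priority order.
    """
    by_target = defaultdict(list)
    for requestor, target in requests:
        by_target[target].append(requestor)      # conflict detection

    grants, queues = {}, {}
    for target, contenders in by_target.items():
        ordered = sorted(contenders, key=lambda r: priority[r])  # priority resolution
        if target in busy:
            queues[target] = ordered             # every contender waits
        else:
            grants[target] = ordered[0]          # set the crosspoint
            if ordered[1:]:
                queues[target] = ordered[1:]
    return grants, queues

grants, queues = allocate(
    [("C1", "M3"), ("IO2", "M3"), ("C2", "M5")],
    priority={"IO2": 0, "C1": 1, "C2": 2},
)
```

Here two requestors contend for memory module `M3`; the higher-priority one is connected and the other is left in a queue of length one, the case the text describes as typical.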

The AOSP: an operating system program. The AOSP is an operating
system program stored in totally shared memory and therefore
available to any computer. The program is run only as needed
to exert control over the system. The AOSP includes its own
executive routine, an operating system for an operating system,
as it were, calling out additional routines, as required. The configuration
of the AOSP thus permits variation from application to
application, both in sequence and quantity of available routines
and in disposition of AOSP storage.

The  AOSP  operates  effectively  on  two  levels,  one  for  system 
control,  the  other  for  task  processing. 

The system control function embodies all that is necessary to
call system programs and associated data from some location in
the I/O complement, and to ready the programs for execution by
finding and allocating space in memory, and initiating the processing.
Most of the system control function (as well as the task
processing function) consists of elaborate bookkeeping for: programs
being run, programs that are active (that is, occupy memory
space), I/O commands being executed, other I/O commands
waiting, external data blocks to be received and decoded, and
activation of the appropriate programs to handle such external
data. It would be inappropriate here to discuss the myriad details
of the AOSP; some idea of its scope, however, can be obtained
from the following list of some of its major functions:

1  Configuration determination

2  Memory allocation

3  Scheduling

4  Program readying and end-of-job cleanup

5  Reporting and logging

6  Diagnostics and confidence checking

7  External interrupt processing

Fig. 1. System organization, Burroughs D825 modular data processing system.

The  task  processing  function  of  the  AOSP  is  to  execute  all 
program  I/O  requests  in  order  to  centralize  scheduling  problems 
and  to  protect  the  system  from  the  possibility  of  data  destruction 
by  ill-structured  or  conflicting  programs. 

AOSP response to interrupts. The AOSP function depends heavily
upon the comprehensive set of interrupts incorporated in the
D825. All interrupt conditions are transmitted to all computer
modules in the system, and each computer module can respond
to all interrupt conditions. However, to make it possible to distribute
the responsibility for various interrupt conditions, both
system and local, each computer module has an interrupt mask
register that controls the setting of individual bits of the interrupt
register. The occurrence of any interrupt causes one of the system
computer modules to leave the program it has been running and
branch to the suitable AOSP entry, entering a control mode as it
branches. The control mode differs from the normal mode of
operation in that it locks out the response to some low-priority
interrupts (although recording them) and enables the execution
of some additional instructions reserved for AOSP use (such as
setting an interrupt mask register or memory protection registers,
or transmitting an I/O instruction to an I/O control module).
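
A minimal sketch of how per-module mask registers could distribute interrupt responsibility is given below. The bit assignments, the module names, and the polarity (a mask bit of 1 lets the condition set the corresponding interrupt register bit) are assumptions for illustration; the text does not specify them.

```python
def latched_interrupts(conditions: int, mask: int) -> int:
    """Bits set in one module's interrupt register: a broadcast condition
    is latched only where that module's mask bit is 1 (assumed polarity)."""
    return conditions & mask

# Hypothetical bit assignments, for illustration only.
IO_COMPLETE = 0b001
EXTERNAL_REQUEST = 0b010
CLOCK_OVERFLOW = 0b100

# Each computer module carries its own mask register.
masks = {
    "computer_1": EXTERNAL_REQUEST | CLOCK_OVERFLOW,
    "computer_2": IO_COMPLETE,
}

def responders(conditions: int) -> list[str]:
    """Modules whose interrupt register receives at least one bit."""
    return [name for name, mask in masks.items()
            if latched_interrupts(conditions, mask)]
```

Every condition reaches every module, but only the module whose mask admits it takes the interrupt, so the AOSP can reassign responsibility simply by rewriting masks.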

In responding to an interrupt, the AOSP transfers control to
the  appropriate  routine  handling  the  condition  designated  by  the 
interrupt.  When  the  interrupt  condition  has  been  satisfied,  control 
is  returned  to  the  original  object  program.  Interrupts  caused  by 
normal  operating  conditions  include: 

1  16 different types of external requests

2  Completion  of  an  I/O  operation 

3  Real-time  clock  overflow 

4  Array  data  absent 

5  Computer-to-computer interrupts

6  Control  mode  entry  (normal  mode  halt) 

Interrupts related to abnormalities of either program or equipment
include: 

1  Attempt  by  program  to  write  out  of  bounds 

2  Arithmetic  overflow 

3  Illegal  instruction 


4  Inability  to  access  memory,  or  an  internal  parity  error; 
parity  error  on  an  I/O  operation  causes  termination  of  that 
operation  with  suitable  indication  to  the  AOSP 

5  Primary  power  failure 

6  Automatic  restart  after  primary  power  failure 

7  I/O  termination  other  than  normal  completion 

While the reasons for including most of the interrupts listed above
are  evident,  a  word  of  comment  on  some  of  them  is  in  order. 

The  array-data-absent  interrupt  is  initiated  when  a  reference 
is  made  to  data  that  is  not  present  in  the  memory.  Since  all  array 
references  such  as  A[k]  are  made  relative  to  the  base  (location 
of  the  first  element)  of  the  array,  it  is  necessary  to  obtain  this 
address  and  to  index  it  by  the  value  k.  When  the  base  of  array 
A  is  fetched,  hardware  sensing  of  a  presence  bit  either  allows  the 
operation  to  continue,  or  initiates  the  array-data-absent  interrupt. 
In this way, keeping track of data in use by interacting programs
can  be  simplified,  as  may  the  storage  allocation  problem. 
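
The presence-bit check can be modelled as follows. This is a sketch under stated assumptions: the class name `Descriptor`, the exception `ArrayDataAbsent` (standing in for the hardware interrupt), and the dictionary used as memory are all hypothetical.

```python
class ArrayDataAbsent(Exception):
    """Stands in for the array-data-absent interrupt."""

class Descriptor:
    """Base word of an array: where it starts and whether it is in memory."""
    def __init__(self, base=None, present=False):
        self.base = base        # location of the first element
        self.present = present  # presence bit sensed when the base is fetched

def fetch(memory, descriptor, k):
    """Return A[k], or raise if the array is not present in memory."""
    if not descriptor.present:
        raise ArrayDataAbsent
    return memory[descriptor.base + k]   # index the base by k

memory = {100: 7, 101: 8, 102: 9}
a = Descriptor(base=100, present=True)
b = Descriptor()                          # array currently on bulk storage
```

A handler for the exception would play the part of the AOSP routine that brings the array into memory, sets the presence bit, and lets the reference be retried.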

The  primary  power  failure  interrupt  is  highest  priority,  and 
always  pre-emptive.  This  interrupt  causes  all  computer  and  I/O 
control  modules  to  terminate  operations,  and  to  store  all  volatile 
information  either  in  memory  modules  or  in  magnetic  thin-film 
registers.  (The  latter  are  integral  elements  of  computer  modules.) 
This  interrupt  protects  the  system  from  transient  power  failure, 
and  is  initiated  when  the  primary  power  source  voltage  drops 
below  a  predetermined  limit. 

The automatic restart after primary power failure interrupt is
provided so that the previous state of the system can be reconstructed.

A description of how an external interrupt is handled might
clarify the general interrupt procedure. Upon the presence of an
external interrupt, the computer which has been assigned responsibility
to handle such interrupts automatically stores the contents
of those registers (such as the program counter) necessary to
subsequently reconstitute its state, enters the control mode, and
goes to a standard (hardware-determined) location where a branch
to the external request routine is located. This routine has the
responsibility of determining which external request line requires
servicing, and, after consulting a table of external devices (teletype
buffers, console keyboards, displays, etc.) associated with the
interrupt lines, the computer constructs and transmits an input
instruction to the requesting device for an initial message. The
computer then makes an entry in the table of the I/O complete
program (the program that handles I/O complete interrupts) to
activate the appropriate responding routine when the message is
read in. A check is then made for the occurrence of additional
external requests. Finally, the computer restores the saved register
contents and returns in normal mode to the interrupted program.
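
The sequence just described can be traced in a short sketch. Everything concrete here is assumed for illustration: the dictionary representation of a computer module, the entry address `EXTERNAL_ENTRY`, the line numbers, the device names, and the routine-naming scheme.

```python
EXTERNAL_ENTRY = 0o20   # hypothetical hardware-determined location

def handle_external_interrupt(cpu, pending_lines, device_table,
                              io_complete_table, issued):
    """Service every pending external request line, then resume."""
    saved = dict(cpu["registers"])        # store program counter, etc.
    cpu["mode"] = "control"
    cpu["registers"]["pc"] = EXTERNAL_ENTRY
    while pending_lines:                  # loop also catches additional requests
        line = pending_lines.pop(0)
        device = device_table[line]       # which device raised this line
        issued.append(("input", device))  # ask the device for its initial message
        io_complete_table[device] = "respond_to_" + device
    cpu["registers"] = saved              # restore the saved state
    cpu["mode"] = "normal"

cpu = {"mode": "normal", "registers": {"pc": 512}}
issued, io_complete_table = [], {}
handle_external_interrupt(
    cpu, [3, 7], {3: "teletype", 7: "console"}, io_complete_table, issued)
```

The handler never waits for the message itself; it only records which routine the I/O complete program should activate, which keeps the time spent in control mode short.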

AOSP  control  of  I/O  activity.  As  mentioned  above,  control  of  all 
I/O  activity  is  also  within  the  province  of  the  AOSP.  Records  are 
kept  on  the  condition  and  availability  of  each  I/O  device.  The 
locations  of  all  files  within  the  computer  system,  whether  on 
magnetic tape, drum, disc file, card, or represented as external
inputs,  are  also  recorded.  A  request  for  input  by  file  name  is 
evaluated,  and,  if  the  device  associated  with  this  name  is  readily 
available,  the  action  is  initiated.  If  for  any  reason  the  request  must 
be  deferred,  it  is  placed  in  a  program  queue  to  await  conditions 
which  permit  its  initiation.  Typical  conditions  which  would  cause 
deferral  of  an  I/O  operation  include: 

1  No  available  I/O  control  module  or  channel. 

2  The  device  in  which  the  file  is  located  is  presently  in  use. 

3  The  file  does  not  exist  in  the  system. 

In  the  latter  case,  typically,  a  message  would  be  typed  out  on  the 
supervisory  printer,  asking  for  the  missing  file. 
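
The three deferral conditions can be expressed as a small decision routine. This is a sketch only; the function name `request_input`, the status strings, and the file and device names are hypothetical, and the real AOSP bookkeeping is certainly more elaborate.

```python
from collections import deque

def request_input(file_name, file_locations, busy_devices, free_modules, queue):
    """Start an input operation or defer it; returns a status string."""
    device = file_locations.get(file_name)
    if device is None:                       # file does not exist in the system
        queue.append(file_name)
        return "deferred: file missing, operator notified"
    if free_modules == 0:                    # no I/O control module or channel
        queue.append(file_name)
        return "deferred: no I/O control module"
    if device in busy_devices:               # device presently in use
        queue.append(file_name)
        return "deferred: device in use"
    busy_devices.add(device)
    return "initiated on " + device

queue = deque()
locations = {"TRACKS": "tape_1", "ORDERS": "drum_1"}
busy = {"drum_1"}
```

Deferred requests stay in `queue` until a later event, such as an I/O complete interrupt, makes it worthwhile to try them again.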

The  I/O  complete  interrupt  signals  the  completion  of  each  I/O 
operation.  Along  with  this  interrupt,  an  I/O  result  descriptor  is 
deposited in an AOSP table. The status relayed in this descriptor
indicates  whether  or  not  the  operation  was  successful.  If  not 
successful,  what  went  wrong  (such  as  a  parity  error,  or  tape  break, 
card  jams,  etc.)  is  indicated  so  that  the  AOSP  may  initiate  the 
appropriate  action.  If  the  operation  was  successful,  any  waiting 
I/O  operations  which  can  now  proceed  are  initiated. 

AOSP control of program scheduling. Scheduling in the D825 relies
upon a job table maintained by the AOSP. Each entry is identified
with a name, priority, precedence requirements, and equipment
requirements. Priority may be dynamic, depending upon time,
external requests, other programs, or a function of many variable
conditions. Each time the AOSP is called upon to select a program
to be run, whether as a result of the completion of a program or
of some other interrupt condition, the job table is evaluated. In
a real-time system, situations occur wherein there is no system
program to be run, and machine time is available for other uses.
This time could be used for auxiliary functions, such as confidence
routines.
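
A job-table evaluation along these lines might look as follows. The field names, the rule that a larger number means higher priority, and the job names are assumptions; the text states only that each entry carries a name, priority, precedence requirements, and equipment requirements.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    name: str
    priority: int                                # larger = more urgent (assumed)
    after: set = field(default_factory=set)      # precedence requirements
    needs: set = field(default_factory=set)      # equipment requirements

def select(job_table, finished, free_equipment):
    """Pick the most urgent job that is both allowed and able to run now."""
    ready = [j for j in job_table
             if j.after <= finished and j.needs <= free_equipment]
    if not ready:
        return "confidence_routine"              # use otherwise idle time
    return max(ready, key=lambda j: j.priority).name

job_table = [
    Job("plot_tracks", 5, needs={"display"}),
    Job("correlate", 9, after={"plot_tracks"}),
    Job("print_log", 1, needs={"printer"}),
]
```

A dynamic priority would be modelled by recomputing the `priority` field before each call to `select`.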

The AOSP provides the capability for program segmentation
at the discretion of the programmer. Control macros embedded
in the program code inform the AOSP that parallel processing with
two or more computers is possible at a given point. In addition,
the programmer must specify where the branches indicated in this
manner will join following the parallel processing.
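
In present-day terms this is a fork/join structure. The sketch below uses Python's standard `concurrent.futures` thread pool as a stand-in for two computer modules; it illustrates the programming pattern only and says nothing about how the AOSP control macros were actually written. The branch functions are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def branch_a(data):
    return sum(data)

def branch_b(data):
    return max(data)

def segmented_program(data, computers=2):
    """Run two independent branches in parallel, then join the results."""
    with ThreadPoolExecutor(max_workers=computers) as pool:
        future_a = pool.submit(branch_a, data)    # fork: branches may run in parallel
        future_b = pool.submit(branch_b, data)
        total, peak = future_a.result(), future_b.result()  # join point
    return total, peak
```

As in the text, the programmer marks both where parallel execution may begin (the two `submit` calls) and where the branches must rejoin (the `result` calls).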

Computer module. The computer modules of the D825 system are
identical, general-purpose, arithmetic and control units. In determining
the internal structure of the computer modules, two considerations
were uppermost. First, all programs and data had to
be arbitrarily relocatable to simplify the storage allocation function
of the AOSP; secondly, programs would not be modified
during execution. The latter consideration was necessary to minimize
the amount of work required to pre-empt a program, since
all that would have to be saved to reinstate the interrupted program
at a later time would be the data for that program and the
register contents of the computer module running the program
at the time it was dumped.

The D825 computer modules employ a variable-length instruction
format made up of quarter-word syllables. Zero-, one-,
two-, or three-address syllables, as required, can be associated with
each basic command syllable. An implicitly addressed accumulator
stack is used in conjunction with the arithmetic unit. Indexing of
all addresses in a command is provided, as well as arbitrarily deep
indirect addressing for data.

Each computer module includes a 128-position thin-film memory
used for the stack, and also for many of the registers of the
machine, such as the program base register, data base register,
the index registers, limit registers, and the like.
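
Base and limit registers are what make programs and data relocatable. A minimal sketch of base-relative addressing with a bounds check follows; the exact semantics of the D825 limit registers are not given in the text, so the interpretation used here (the limit is the first address past the allotted block) is an assumption, as are the names.

```python
class OutOfBounds(Exception):
    """Stands in for the write-out-of-bounds interrupt."""

def effective_address(base: int, limit: int, offset: int) -> int:
    """Relocate `offset` against a base register and check the result
    against a limit register (assumed: first address past the block)."""
    address = base + offset
    if offset < 0 or address >= limit:
        raise OutOfBounds(address)
    return address

# The same program-relative offset works wherever the block is placed.
first_placement = effective_address(base=4096, limit=8192, offset=10)
after_relocation = effective_address(base=12288, limit=16384, offset=10)
```

Because the program holds only offsets, moving it requires changing the register contents and nothing in the program itself.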

The  instruction  complement  of  the  D825  includes  the  usual 
fixed-point,  floating-point,  logical,  and  partial-field  commands 
found  in  any  reasonably  large  scientific  data  processor. 

Memory module. The memory modules consist of independent
units storing 4096 words, each of 48 bits. Each unit has an individual
power supply and all of the necessary electronics to control
the reading, writing, and transmission of data. The size of the
memory modules was established as a compromise between a
module size small enough to minimize conflicts wherein two or
more computer or I/O modules attempt access to the same memory
module, and a size large enough to keep the cost of duplicated
power supplies and addressing logic within bounds. It might be
noted that for a larger modular processor system, these trade-offs
might indicate that memory modules of 8192 words would be more
suitable. Modules larger than this (of 16,384 or 32,768 words, for
example) would make construction of relatively small equipment
complements meeting the requirements set forth above quite
difficult. The cost of smaller units of memory is offset by the
lessening of catastrophe in the event of failure of a module.
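
The arithmetic behind the module size is easy to check. The sketch below assumes, for illustration, that consecutive addresses fill one module before the next (the text does not say how addresses are distributed over modules); the 16-module maximum is the figure given for equipment complements.

```python
WORDS_PER_MODULE = 4096
MAX_MODULES = 16

def locate(address: int) -> tuple[int, int]:
    """Split a system-wide word address into (module, word within module),
    assuming consecutive addresses fill one module before the next."""
    if not 0 <= address < WORDS_PER_MODULE * MAX_MODULES:
        raise ValueError("address outside maximum complement")
    return divmod(address, WORDS_PER_MODULE)

max_capacity = WORDS_PER_MODULE * MAX_MODULES   # words in a full system
```

With 4096-word modules, losing one module of a full complement removes one sixteenth of memory; with 32,768-word modules the same capacity would sit in two units, and a single failure would remove half.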

I/O control module. The I/O control module executes I/O operations
defined and initiated by computer module action. In keeping
with  the  system  objectives,  I/O  control  modules  are  not  assigned 
to  any  particular  computer  module,  but  rather  are  treated  in  much 
the  same  way  as  memory  modules,  with  automatic  resolution  of 
conflicting  attempted  accesses  via  the  switching  interlock  function. 
Once  an  I/O  operation  is  initiated,  it  proceeds  independently  until 
completion. 

I/O  action  is  initiated  by  the  execution  of  a  transmit  I/O 
instruction  in  one  of  the  computer  modules,  which  delivers  an  I/O 
descriptor  word  from  the  addressed  memory  location  to  an  inactive 
I/O  control  module.  The  I/O  descriptor  is  an  instruction  to  the 
I/O control module that selects the device, determines the direction
of data flow, the address of the first word, and the number
of  words  to  be  transferred. 

Interposed  between  the  I/O  control  modules  and  the  physical 
external  devices  is  another  crossbar  switch  designated  the  I/O 
exchange.  This  automatic  exchange,  similar  in  function  to  the 
switching  interlock,  permits  two-way  data  flow  between  any  I/O 
control  module  and  any  I/O  device  in  the  system.  It  further 
enhances  the  flexibility  of  the  system  by  providing  as  many  possible 
external  data  transfer  paths  as  there  are  I/O  control  modules. 

Equipment complements. A D825 system can be assembled (or
expanded) by selection of appropriate modules in any combination
of: one to four computer modules, one to 16 memory modules,
one to ten I/O control modules, one or two I/O exchanges, and
one to 64 I/O devices per I/O exchange in any combination
selected from: operating (or system status) consoles, magnetic tape
transports, magnetic drums, magnetic disc files, card punches and
readers, paper tape perforators and readers, supervisory printers,
high-speed line printers, selected data converters, special real-time
clocks, and intersystem data links.


Table 1  Specifications, D825 modular data processing system

Computer module:                  4, maximum complement
Computer module, type:            Digital, binary, parallel, solid-state
Word length:                      48 bits including sign (8 characters, 6 bits
                                  each) plus parity
Index registers
(in each computer module):
Magnetic thin film registers
(in each computer module):        128 words, 15 bits per word, 0.33-μsec
                                  read/write cycle time
Real-time clock
(in each computer module):        10 msec resolution
Binary add:                       1.67 μsec (average)
Binary multiply:                  35.0 μsec (average)
Floating-point add:               7.0 μsec (average)
Floating-point multiply:          34.0 μsec (average)
Logical AND:                      0.33 μsec
Memory type:                      Homogeneous, modular, random-access,
                                  linear-select, ferrite-core
Memory capacity:                  65,536 words (16 modules maximum, 4096
                                  words each)
I/O exchanges per system:         1 or 2
I/O control modules:              10 per exchange, maximum
I/O devices:                      64 per exchange, maximum
Access to I/O devices:            All I/O devices available to every I/O control
                                  module in exchange
Transfer rate per I/O exchange:   2,000,000 characters per second
I/O device complement:            All standard I/O types, including 67 kc
                                  magnetic tapes, magnetic drums and discs, card
                                  and paper tape punches and readers, character
                                  and line printers, communications and
                                  display equipment


Fig. 2. Typical D825 equipment array.

Figure 2 is a photograph of some of the hardware of a completed
D825 system. The equipment complement of this system
includes two computer modules, four memory modules (two per
cabinet), two I/O control modules (two per cabinet), one status
display console, two magnetic tape units, two magnetic drums,
a card reader, a card punch, a supervisory printer, and an electrostatic
line printer.

D825  characteristics  are  summarized  in  Table  1. 

Summary  and  conclusion 

It  is  the  belief  of  the  authors  that  modular  systems  (in  the  sense 
discussed  above)  are  a  natural  solution  to  the  problem  of  obtaining 
greater  computational  capacity — more  natural  than  simply  to 
build larger and faster machines. More specifically, the organizational
structure of the D825 has been shown to be a suitable basis
for the data processing facility for command and control. Although
the investigation leading toward this structure proceeded as an
attack upon a number of diverse problems, it has become evident
that the requirements peculiar to this area of application are, in
effect, aspects of a single characteristic, which might be called
structural freedom. Furthermore, it is now clear that the most
distinctive characteristic of the structure realized (integrated operation
of freely intercommunicating, totally modular elements)
provides the means for achieving structural freedom.

For  example,  one  requirement  is  that  some  specified  minimum 
of  data  processing  capability  be  always  available,  or  that,  under 
any  conditions  of  system  degradation  due  to  failure  or  mainte- 
nance, the  equipment  remaining  on  line  be  sufficient  to  perform 
primary  system  functions.  In  the  D825,  module  failure  results  in 
a  reduction  of  the  on-line  equipment  configuration  but  permits 
normal  operation  to  continue,  perhaps  at  a  reduced  rate.  The 
individual  modules  are  designed  to  be  highly  reliable  and  main- 
tainable, but  system  availability  is  not  derived  solely  from  this 
source,  as  is  necessarily  the  case  with  more  conventional  systems. 
The  modular  configuration  permits  operation,  in  effect,  with  active 
spares,  eliminating  the  need  for  total  redundancy. 


A  second  requirement  is  that  the  working  configuration  of  the 
system at a given moment be instantly reconstructable to new
forms more suited to a dynamically and unpredictably changing
work load. In the D825, all communication routes are public, all
modules  are  functionally  decoupled,  all  assignments  are  scheduled 
dynamically,  and  assignment  patterns  are  totally  fluid.  The  system 
of  interrupts  and  priorities  controlled  by  the  AOSP  and  the 
switching  interlock  permits  instant  adaptation  to  any  work  load, 
without  destruction  of  interrupted  programs. 

The requirement for expansibility calls simply for adaptation
on a greater time scale. Since all D825 modules are functionally
decoupled, modules of any type may be added to the system
simply by plugging into the switching interlock or the I/O exchange.
Expansion in all functional areas may be pursued far
beyond that possible with conventional systems.

It is clear, however, that the D825 system would have fallen
far short of the goals set for it if only the hardware had been
considered. The AOSP is as much a part of the D825 system
structure as is the actual hardware. The concept of a "floating"
AOSP as the force that molds the constituent modules of an
equipment complement into a system is an important notion
having an effect beyond the implementation of the D825. One
interesting by-product of the design effort for the D825 has, in
fact, been a change of perspective: it has become abundantly clear
that computers do not run programs, but that programs control
computers.


References 

AndeJ62; KrogM61; LeinA57; PortR60; ThomR63


Chapter  37 


A  survey  of  problems  and  preliminary 
results  concerning  parallel  processing 
and parallel processors¹

M.  Lehman 

Summary  After  an  introduction  which  discusses  the  significance  of  a  trend 
to  the  design  of  parallel  processing  systems,  the  paper  describes  some  of 
the  results  obtained  to  date  in  a  project  which  aims  to  develop  and  evaluate 
a  unified  hardware-software  parallel  processing  computing  system  and  the 
techniques  for  its  use. 

1.    Multiprogramming,  multiprocessing, 
and  parallel  processing 

A  brief  review  of  the  literature,  of  which  a  partial  listing  is  given 
in  the  bibliography,  reveals  an  active  and  growing  interest  in 
multiprogramming,  multiprocessing,  and  parallel  processing. 
These  three  terms  distinguish  three  modes  of  usage  and  also  serve 
to  indicate  a  certain  historical  development.  We  cannot  here 
attempt to trace this history in detail and so must rely on the
bibliography  to  credit  the  contributions  from  industrial,  university, 
and  other  research  and  development  organizations. 

The emergence of autonomous input-output devices first suggested
[Gill, 1958] the time-sharing of the processing and peripheral
units of a computing system among several jobs. Thus surplus
capability that could not be applied to the processing of the
leading job in a batch processing load, at any stage of the computation,
could be usefully applied to successor jobs in the work load.
In particular, while any computation was held up for some I/O
activity, the single main processor could be used for other computation.
The necessary decision-taking, scheduling, and allocation
procedures were vested in a supervisor program, within which the
user-jobs were embedded, and the resultant mode of operation was
termed Multiprogramming.

The  use  of  computers  in  on-line  control  situations  and  for  other 
applications  giving  rise  to  ever-more  stringent  reliability  and 
availability  specifications,  resulted  in  the  construction  of  systems 
including  two  or  more  central  processing  units  [Leiner  et  al.,  1959; 
Bright, 1964; Desmonde, 1964; McCullough et al., 1965]. Under

¹Proc. IEEE, vol. 54, no. 12, pp. 1889-1901, December, 1966.


normal  circumstances,  with  all  units  operational,  each  could  be 
assigned  a  specific  activity  within  an  overall  control  program.  As 
a  result  of  the  multiplicity  of  units  in  such  Multiprocessing  Systems, 
failure  of  any  one  would  degrade,  but  not  immobilize,  the  system, 
since  a  supervisor  program  could  re-assign  activities  and  configure 
the  failed  unit  out  of  the  system.  Subsequently,  it  was  recognized 
that  such  systems  had  advantages  over  a  single  processor  system 
in  a  more  general  environment,  with  each  processor  in  the  system 
having  a  multiprogramming  capability  as  well. 

Finally,  following  from  ideas  first  exploited  in  the  Gamma  60 
Computer  [Dreyfus,  1958],  there  has  come  the  realization  that 
multi-instruction counter systems can speed up computation, particularly
of large problems, when these may be partitioned into
sections  which  are  substantially  independent  of  one  another,  and 
which  may  therefore  be  executed  concurrently — that  is,  in  parallel. 
When  the  several  units  of  a  multiprocessing  system  are  utilized 
to  process,  in  parallel,  independent  sections  of  a  job,  we  exploit 
the  macro-parallelism  [Lehman,  1965]  of  the  job,  which  is  to  be 
distinguished  from  micro-parallelism  [Lehman,  1965],  the  relative 
independence  of  individual  machine  instructions,  exploited  in 
look-ahead  machines.  This  mode  of  operation  is  termed  Parallel 
Processing and, as in PL/I [IBM OS/360, PL/I Language Specification,
Form C28-6571, p. 74], the execution of any program string
is  termed  a  Task.  We  note  that  parallel  processing  may,  and 
normally  will,  include  multiprocessing  activity. 

2.    The  approach  to  parallel  processing  system  design 

In  the  previous  section  we  indicated  that  the  prime  impetus  for 
the  development  of  parallel  processing  systems  arose  from  their 
potential  for  high  performance  and  reliability.  These  systems  may 
operate  as  pools  of  resources  organized  in  symmetrical  classes  and 
it is this property that promises High Availability. They also
possess  a  great  reserve  of  power  which,  when  applied  to  a  single 
problem with the appropriate degree of parallelism, can yield high
performance and fast turn around time. Surplus resources can be
applied  to  other  jobs,  so  that  the  system  is  potentially  efficient, 
displaying  a  peak-load  averaging  effect  and  hence  high  utilization 
of hardware [Corbato and Vyssotsky, 1965]. The concept of sharing
in parallel processing systems and its related cost reduction is not,
however, limited to hardware. Perhaps even more significant is
the common use of data-sets maintained in a system library or
file, and even concurrent access during execution from a high-speed
store. This may represent considerable economy in storage
space and in processing time for I/O and internal memory-hierarchy
transfers. But above all [Corbato and Vyssotsky, 1965]
it facilitates the sharing of ideas, experience, and results and a
cross fertilization among users, a prospect which from a long term
point of view represents perhaps the most significant potential of
large, library-oriented, multiprocessing systems. Finally, in this
brief summary of the basic advantages of parallel processing
systems, we refer to their intrinsic modularity, which may yield
an expandable system in which the only effect of expansion on
the user is improved performance.

Adequate performance of parallel processing systems is, however,
predicated on an appropriately low level of overhead. Allocation,
scheduling, and supervisory¹ strategies, in particular, must
be simplified and the related procedures minimized to comprise
a small proportion of the total activity in the system. The system
design must be based on performance objectives that permit a user
to specify a time period and a tolerance within which he requires
and expects to receive results, and the cost for which these will
be obtained. In general the entire system must yield minimum
throughput time for the large job, adequate response time to the
terminal requests in conversational mode, guaranteed throughput
time for real-time tasks, and minimum cost processing for the
batch-processed small job. These needs require the development
of an executive and supervisory system integrated with the hardware
into a single, unified computing system. Finally, the techniques
and algorithms of classical computation, of problem analysis,
and of programming, must be modified and new, intrinsically
parallel procedures developed if full advantage is to be gained
from exploitation of these parallel systems.

Our studies to date represent but a small fraction of the ground
that will have to be covered if effective parallel processing systems
are to come into their own. It is, however, abundantly clear that
such systems will yield their potential only if the design is approached
on a broad but unified front ranging from problem

¹We differentiate intuitively between executive and supervisory activities.
The former are those whose costs should be chargeable to the individual
user directly, whereas the latter are absorbed in the system running costs.


analysis  and  usage  techniques,  through  executive  strategies  and 
operating  systems,  to  logic  design  and  technology.  We  therefore 
present  concepts  and  results  from  each  of  these  areas,  as  obtained 
during  our  preliminary  investigation  into  the  design  and  use  of 
parallel  processing  systems. 


3.  Language 

3.1  Parallelism  in  high  level  languages 

The  analysis  of  high  level  language  requirements  for  parallel 
processing  has  received  considerable  attention  in  the  literature. 
We may refer in particular to the paper by Conway [1963] which
discussed  the  concepts  of  Fork,  Join,  and  Quit,  and  the  recent 
review  by  Dennis  and  Van  Horn  [1966]. 

Recognizing that programming languages should possess capabilities
that express the structure of the computational algorithm,
Schlaeppi [19??] has proposed augmentations to PL/I-like languages
that portray the macro-parallelism in numerical algorithms.
These in turn have been reflected in proposals for machine-language
implementation. As examples we discuss Split, Terminate,
Assemble, Test and Set or Wait (interlock), Resume, Store-Test and
Branch, and External Execute instructions. We describe here only
the basic functional elements, from which machine instructions
for actual realization will be composed as suggested by practical
programming experience.

3.2  Machine level instructions for tasking

Split  provides  the  basic  task-generating  capability.  It  indicates  that 
in  addition  to  continuing  the  execution  of  the  present  instruction 
string  in  normal  fashion  a  new  task,  or  set  of  tasks,  may  be  initi- 
ated, execution  starting  at  a  specified  address  or  set  of  addresses. 
Such  potential  tasks  will  be  queued  to  await  pick-up  by  an  appro- 
priate processing  unit. 

Terminate causes cessation of activity on a task. The terminating
unit will, of its own volition, access an appropriate queue to
obtain its next task. Alternatively, it may execute an executive
allocation-task  to  determine  which  of  a  number  of  task-queues  is 
to  be  accessed  next  according  to  the  current  urgency  status  of  work 
in  the  system. 

Assemble permits the merging of several tasks. The first (n − 1)
tasks in an n-way parallel set belonging to a single job, reaching
the assemble instruction, terminate. The nth task, however, will
proceed  to  execute  the  program  string  which  constitutes  the 
continuation  of  all  n  tasks. 
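
The Split, Terminate, and Assemble functions described above can be sketched in software. The following Python fragment is only an illustration under stated assumptions: threads stand in for processing units, one shared queue stands in for the task queue, and the names run_job and AssembleCounter, as well as the sum-of-squares workload, are hypothetical and do not come from the paper.

```python
import queue
import threading


class AssembleCounter:
    """Join point: the first n-1 arriving tasks terminate, the nth continues."""

    def __init__(self, n):
        self._remaining = n
        self._lock = threading.Lock()

    def arrive(self):
        with self._lock:
            self._remaining -= 1
            return self._remaining == 0   # True only for the last arrival


def run_job(num_processors, n_tasks):
    task_queue = queue.Queue()            # potential tasks awaiting pick-up
    partial = [0] * n_tasks
    results = []
    join = AssembleCounter(n_tasks)

    def make_task(i):
        def task():
            partial[i] = i * i            # an independent section of the job
            if join.arrive():             # Assemble
                results.append(sum(partial))   # nth task runs the continuation
            # every other task simply ends here (Terminate)
        return task

    for i in range(n_tasks):              # Split: queue n potential tasks
        task_queue.put(make_task(i))

    def processor():
        # A unit fetches its next task of its own volition until none remain.
        while True:
            try:
                task = task_queue.get_nowait()
            except queue.Empty:
                return
            task()

    units = [threading.Thread(target=processor) for _ in range(num_processors)]
    for u in units:
        u.start()
    for u in units:
        u.join()
    return results


if __name__ == "__main__":
    print(run_job(num_processors=3, n_tasks=8))
```

Whichever unit happens to reach the join last executes the continuation, so the result does not depend on how tasks and processors were matched.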




Test  and  Set  or  Wait  provides  an  interlock  facility.  Thus  a 
number  of  tasks  all  operating  on  a  common  data  set  may  be 
required  to  filter  through  certain  sections  of  program  or  data,  one 
at  a  time.  This  may  be  achieved  by  an  instruction  related  to  the 
S/360 test and set instruction [Falkoff et al., 1964], but causing
the  task  finding  the  specified  location  to  be  already  set  to  go  into 
a  wait  state.  System  efficiency  requires  that  processors  do  not  idle, 
so  that  the  waiting  task  will  generally  be  returned  to  queue  and 
the  processor  released  for  other  work. 

Resume  directs  a  processor  or  processors  waiting  as  a  result 
of  a  test  on  a  specified  location,  to  proceed,  or  more  generally, 
that  specified  waiting  tasks  that  have  been  returned  to  queue  be 
re-activated  to  await  the  spontaneous  availability  of  an  appropri- 
ate processor. 
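
A minimal sketch of the interlock behaviour may help here. It assumes a deterministic, step-by-step simulation rather than real hardware; the class Interlock, the function simulate, and the parameters pickups_per_step and hold_steps are hypothetical names chosen for illustration. A task that finds the location already set is parked instead of keeping a processor idle, and clearing the location re-queues the parked tasks, which corresponds to Resume.

```python
from collections import deque


class Interlock:
    """One storage location used as a lock, plus the tasks parked on it."""

    def __init__(self):
        self.is_set = False
        self.waiting = deque()

    def test_and_set_or_wait(self, task):
        """Set the location and return True, or park the task and return False."""
        if not self.is_set:
            self.is_set = True
            return True
        self.waiting.append(task)      # task gives up its processor and waits
        return False

    def clear_and_resume(self, ready_queue):
        """Clear the location, then re-queue every parked task (Resume)."""
        self.is_set = False
        while self.waiting:
            ready_queue.append(self.waiting.popleft())


def simulate(tasks, pickups_per_step=2, hold_steps=2):
    """Pass tasks one at a time through an interlocked section; return the log."""
    ready = deque(tasks)
    lock = Interlock()
    holder, remaining, log = None, 0, []
    while ready or holder is not None:
        if holder is not None:
            remaining -= 1
            if remaining <= 0:
                log.append(("leave", holder))
                holder = None
                lock.clear_and_resume(ready)
        for _ in range(pickups_per_step):      # free units pick up queued tasks
            if not ready:
                break
            task = ready.popleft()
            if lock.test_and_set_or_wait(task):
                holder, remaining = task, hold_steps
                log.append(("enter", task))
            else:
                log.append(("wait", task))
    return log


if __name__ == "__main__":
    for event in simulate(["A", "B", "C"]):
        print(event)
```

In this sketch a resumed task may have to wait again if another task wins the location first, which matches the idea that re-activated tasks merely await the next available processor.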

Test  and  Branch  Storage  Location  permits  communication  be- 
tween parallel  tasks  based  on  tests  analogous  to  the  register  tests 
of  uniprocessors,  but  associated  with  the  contents  of  storage  loca- 
tions. This  is  desirable  since  processor  registers  are  private  to  the 
processor  and  inaccessible  from  outside. 

External  Execute  is  a  special  case  of  the  general  interaction 
facility  discussed  in  Section  4  that  permits  related  tasks  to  influ- 
ence one  another.  This  can  be  achieved  through  the  application 
of  instructions  already  discussed.  It  is,  however,  more  efficient  to 
provide  a  new  facility  akin  to  the  Interrupt  concept.  By  applying 
this  Interaction  function,  a  task  may  cause  other  specified  tasks 
to  execute  an  instruction  at  a  specified  location,  each  on  comple- 
tion of  its  present  instruction.  Thus,  for  example,  a  number  of 
processors  searching  for  a  particular  item  in  a  partitioned  list  can 
be  caused  to  abandon  the  search  when  the  item  has  been  located 
by  one,  while  processors  searching  for  other  items,  or  otherwise 
busy,  will  not  be  redirected. 
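
The partitioned-list search used as the example above can be sketched as follows. This is an illustrative simulation, not the paper's mechanism: processors are modelled as cursors that each examine one item per cycle, and the External Execute is modelled by stopping all searchers at the end of the cycle in which one of them finds the item (each completes its present step first). The function name partitioned_search and its return values are assumptions made for this sketch, and processors busy with unrelated work are not modelled.

```python
def partitioned_search(items, target, num_processors):
    """Return (index or None, cycles used, items examined)."""
    size = -(-len(items) // num_processors)          # ceiling division
    cursors = [
        iter(range(p * size, min((p + 1) * size, len(items))))
        for p in range(num_processors)
    ]
    found_at = None
    cycles = examined = 0
    busy = True
    while busy and found_at is None:
        busy = False
        for cursor in cursors:                       # one step per processor
            idx = next(cursor, None)
            if idx is None:
                continue                             # this partition is exhausted
            busy = True
            examined += 1
            if items[idx] == target and found_at is None:
                found_at = idx                       # finder issues External Execute
        if busy:
            cycles += 1
    return found_at, cycles, examined


if __name__ == "__main__":
    print(partitioned_search(list(range(12)), 9, num_processors=3))
```

With twelve items split among three searchers, an item at index 9 is located in the second cycle, after which the remaining searchers abandon their partitions instead of scanning them to the end.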

4.  Interaction 

4.1    The  interaction  concept 

An  extension  of  the  task  interaction  concept  introduced  in  the 
preceding section is fundamental to efficient parallel processing.
In  the  particular  example  cited,  the  interaction,  in  the  form  of 
an  external  execute  instruction,  forms  part  of  the  computational 
procedure.  In  fact,  many  other  situations  arise  in  which  processing 
for  inter-task  communication  may  be  detached  from  problem 
processing  and  be  carried  through  concurrently  in  autonomous 
units,  thereby  increasing  system  utilization. 

We  therefore  propose  to  associate  with  each  active  unit  in  the 
system  an  autonomous  Interaction  Controller.  Groups  of  controllers 


are  linked  by  a  special  bus.  This  provides  facilities  whereby  any 
one  unit  may,  at  a  given  time,  act  as  a  command  or  signal  source 
with  all  other  units  potential  recipients.  By  thus  systemizing 
inter-unit  communication  and  making  it  a  concurrent  activity,  we 
both increase system utilization and remove a maze of interconnecting
cables. Succeeding subsections describe some of the functions
that the controllers fulfill and, briefly, one hardware proposal
for their realization.

4.2    Interaction  activities 

In  present-day  systems  there  already  exist  activities  of  the  type 
to be classified as interaction. Thus, for example, in System/360
we find a CPU to Channel Halt I/O facility, channel interruptions
of  processors,  and  timer  interruptions.  In  extending  the  concept 
we  differentiate  among  three  classes  of  interaction. 

PROBLEM  INTERACTION.  These  relate  to  logical  dependencies 
between  tasks,  and  will  generally  require  waits,  forced  branches, 
or  terminations.  Search  termination,  previously  discussed,  is  an 
example of this type of interaction, as are data and instruction-sequence
interlocks.

EXECUTIVE  INTERACTION.  This  activity  is  concerned  primarily 
with  the  allocation  of  system  resources.  Consider,  for  example,  the 
problem of processing interrupts in a parallel processing system.
These will usually not need to interrupt a computing activity, but
may await the spontaneous availability of a unit at a Terminate,
a natural breakpoint.¹ If an interrupt does become critical it should
not  be  applied  to  a  specific  physical  unit.  Instead  the  interruption 
should  be  steered  to  that  unit  which,  by  virtue  of  the  work  it  is 
processing,  may  be  classed  as  Most  Interruptable.  Selection  of  the 
latter  may  be  obtained  ahead  of  time  and  is  maintained  by  the 
interaction  system,  on  the  basis  of  the  relative  urgency  of  tasks. 
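
The steering of a critical interrupt can be illustrated with a short sketch. It assumes that urgency is a single number per task and that the unit running the least urgent task is the Most Interruptable; the paper does not specify the selection rule in this detail, and the class and method names below are hypothetical. The point illustrated is that the choice is recomputed whenever a unit picks up a task, so it is already available when an interrupt becomes critical.

```python
class InterruptSteering:
    """Tracks task urgency per unit and keeps the Most Interruptable unit ready."""

    def __init__(self):
        self.urgency = {}                 # unit name -> urgency of its current task
        self.most_interruptable = None    # maintained ahead of time

    def task_picked_up(self, unit, urgency):
        """Record the urgency of the task a unit has just started."""
        self.urgency[unit] = urgency
        # Lowest urgency wins; the unit name breaks ties deterministically.
        self.most_interruptable = min(
            self.urgency, key=lambda u: (self.urgency[u], u)
        )

    def steer_critical_interrupt(self):
        """Return the unit that should take a critical interrupt, if any."""
        return self.most_interruptable


if __name__ == "__main__":
    steering = InterruptSteering()
    steering.task_picked_up("P1", 5)
    steering.task_picked_up("P2", 2)
    steering.task_picked_up("P3", 7)
    print(steering.steer_critical_interrupt())
```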

Another  example  of  executive  interaction  concerns  the  constant 
provision  of  queue  status  information  to  all  active  units.  Besides 
simplifying  scheduling  activity  this  may  prevent  units  from  access- 
ing empty  queues,  reducing  both  storage  and  executive  interfer- 
ence. Similarly,  units  can  be  caused  to  access  a  previously  empty 
queue  when  an  entry  is  made,  obviating  continuous  testing  of 
queue  status. 

¹This is possible in a parallel processing system since tasks are smaller than
jobs and since there are many processors. Furthermore, units operate
anonymously. That is, on picking up a task, a unit records the task identity
in an internal register and its own identity in a table associated with the
work queue. Other processors do not, therefore, know how tasks and
processors are matched at any time, since this is a matter of chance, and
determination would require an extensive and wasteful table search.




The  interaction  system  also  supports  other  activities  associated 
with accounting, recording, and general system supervision.

SYSTEM  INTERACTION.  System  interaction  provides  controls  and 
interlocks  for  operation  and  maintenance  of  the  physical  system. 
It  includes,  for  example,  interchange  of  information  between 
active  units  about  the  validity  of  storage  map  entries,  storage 
protection  control,  queue  interlocks,  checks  and  counts  of  unit 
availability,  the  initiation  of  routine  and  emergency  diagnostic  and 
maintenance  activity,  and  the  isolation  of  malfunctioning  units. 

SUMMARY.  The  preceding  paragraphs  have  indicated  some  of  the 
many  applications  of  an  interaction  controller.  The  common 
property which, for practicality, has been used to identify potential
interaction activities is that they should be autonomous relative
to the main computational stream and that their execution
should not require access to storage.

4.3    The  interaction  controller 

4.3.1.  The basic system hardware architecture. It is not intended
to  give  a  full  description  of  an  interaction  controller  in  the  present 
paper.  We  shall,  however,  outline  its  basic  structure,  indicate  its 
mode  of  operation,  and  list  some  of  the  proposed  interaction 
instructions, termed Directives.

As  a  first  step  we  introduce,  in  Fig.  1,  a  diagrammatic  descrip- 
tion of  an  overall  representative  hardware  system.  This  consists 
of  central  processors  (Pi)  with  local  storage  (LSi),  I/O  processors 
(SCi),  storage  modules  (Si),  a  requestor-storage  queue  (Qi),  and  a 
communication  system  functionally  equivalent  to  a  crossbar 
switch.  An  I/O  area,  including  a  bulk-store,  files,  channels  (Ch), 
devices,  device  control  units  (Cu),  and  interconnection  networks, 
is  indicated  in  less  detailed  fashion. 

4.3.2.  Interaction  controllers.  Interaction  controllers  (IC)  are 
associated  with  all  central  and  I/O  processors,  and  communicate 
with  each  other  over  a  special  bus.  Similarly  localized  interaction 
systems  may  provide  a  facility  for  certain  classes  of  I/O  units  or 
devices  to  interact  amongst  themselves. 

To  be  economically  feasible,  the  Interaction  Controller  must 
be  simple.  Figure  2  illustrates  a  structure  which  includes  about 
two  hundred  and  fifty  bits  of  storage,  of  which  about  half  are 
organized  in  registers.  The  remainder  are  used  as  status  bits  or 
appear  in  the  controller-processor  interface.  Control  is  obtained 
from  a  read-only  store,  whose  capacity  depends  on  the  size  of  the 
directive  repertoire  (an  interaction  directive  being  analogous  to 


a  processor  instruction)  and  the  number  of  interaction  functions 
it  is  required  to  implement. 

Controller  connection  to  the  ten-bit  wide  interaction  bus  is  by 
means  of  OR  gates.  When  an  interaction  is  occurring,  one  and 
only  one  controller  will  be  in  command  of  the  bus.  Figure  3 
illustrates  the  sequence  of  events  required  to  implement  an  inter- 
action. 

Fig. 1. A representative system hardware configuration.

Fig. 2. The interaction controller. [Block diagram: processor or channel interface; job identification, job priority, interlock identification, and directive registers; read-only store; seizure code and class number; equality detector; interaction bus.]

The controller required by its associated processor to initiate
an activity will await availability of the bus, indicated by an ALL
ZERO state, and will then attempt to seize control by transmitting
a unique identifying four-out-of-eight code. Should more than one
controller attempt to seize the bus at the same time, a conflict
resolution procedure is initiated. This is based on the simultaneous
transmission by all requesting controllers of a second, two byte,
identifying code. Each byte consists of one or more ones followed
by all zeros. A simple comparison by each controller of its transmitted
signals with the state of the bus identifies to itself that
controller having the most ones in each byte, since it will have
found a match on both comparisons. This enables it to seize the
bus and to switch to the command state. All remaining controllers
remain in the listening state.
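
The conflict resolution just described relies on the bus forming the logical OR of everything transmitted: when every code is a run of ones followed by zeros, the OR equals the code with the most ones, so only that sender sees its own signal on the bus. The sketch below illustrates this for two bytes. It makes one assumption the paper does not state, namely that controllers failing the first-byte comparison withdraw before the second byte is compared, and the names thermometer, wired_or, and resolve are hypothetical.

```python
def thermometer(ones, width=8):
    """A byte of `ones` ones followed by zeros, as a tuple of bits."""
    if not 1 <= ones <= width:
        raise ValueError("each byte needs between one and `width` ones")
    return tuple([1] * ones + [0] * (width - ones))


def wired_or(words):
    """State of the bus when several controllers drive it through OR gates."""
    return tuple(int(any(bits)) for bits in zip(*words))


def resolve(codes):
    """codes maps controller name -> (ones in byte 1, ones in byte 2).

    Returns the controller that matches the bus on both comparisons.
    """
    contenders = set(codes)
    for byte_index in (0, 1):
        sent = {c: thermometer(codes[c][byte_index]) for c in contenders}
        bus = wired_or(sent.values())
        # Each controller compares what it sent with what the bus shows.
        contenders = {c for c in contenders if sent[c] == bus}
    if len(contenders) != 1:
        raise ValueError("identifying codes must be unique")
    return contenders.pop()


if __name__ == "__main__":
    print(resolve({"IC1": (3, 5), "IC2": (6, 2), "IC3": (6, 4)}))
```

No controller needs to know who else is contending; each decides locally from a single comparison per byte.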

The  controller  in  command  of  the  bus  then  transmits  signals 
which  select  recipients  for  the  directives  which  are  to  follow. 
Other  controllers  ignore  all  further  communications  until  the  next 
selection  signal  appears. 

4.4    Interaction  directives 

A signal designating the interaction function required by a processor
is transmitted across the processor/controller interface, as
the  result  of  the  execution  of  some  processor  instruction.  The 
processor  will  then  generally  continue  its  execution  sequence 
unless  or  until  it  is  required  to  pass  on  a  second  interaction  func- 
tion before  a  previously  issued  function  has  been  completed.  Upon 
receipt  of  the  interaction  command,  and  after  successful  seizure 
of  the  bus  as  described,  the  command  controller  may  initiate 


Fig. 3. The interaction sequence. [Flowchart: Interaction required? Bus free? Conflict? Resolve. Select recipients. Transmit order or question.]

execution  of  the  interaction  by  transmitting  a  sequence  of  one  or 
more  directives  to  the  selected  units.  A  basic  set  of  directives  is 
listed  in  Table  1. 

The  Compare  directives  are  most  frequently  used  to  seize  the 
bus  and  to  select  a  subset  of  the  controllers  for  the  receipt  of 
subsequent  directives.  The  remaining  units  ignore  further  direc- 
tives until  alerted  by  an  Attention  signal  or  until  Free  Bus  provides 
the  release  that  permits  waiting  controllers  to  attempt  to  seize 
the  bus.  Receive  provides  for  transmittal  of  data  between  control- 
lers; for  example,  transmission  of  a  machine  instruction  to  a  se- 
lected set  of  controllers,  followed  by  the  directive  Interact.  Thus 
this  sequence  could  realize  the  basic  interaction  function.  External 
Execute  is,  however,  considered  so  fundamental  to  efficient  exploi- 
tation of  a  parallel  processing  system  that  we  include  it  as  an 

Table 1

Send and Compare
Compare
Receive
Set Status Bits
Interact
External Execute
Attention
Free Bus




explicit directive. Status bits that may be set or reset by appropriate
directives provide data on the status of various system queues,
on the interruptability of given processors, on Wait status, and
so on.
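
One possible reading of the directive sequence Compare, Receive, Interact, Free Bus is sketched below. The semantics are assumptions made for illustration: Compare is taken to select the controllers whose job identification matches a transmitted value, Receive to deliver data to the selected controllers, and Interact to hand that data to the attached processors. The classes Controller and InteractionBus and all of their attributes are hypothetical; the paper gives no such interface.

```python
class Controller:
    def __init__(self, name, job_id):
        self.name = name
        self.job_id = job_id       # stands in for the job identification register
        self.listening = False     # True while selected to receive directives
        self.buffer = None         # data delivered by Receive
        self.executed = []         # what the attached processor was made to do


class InteractionBus:
    def __init__(self, controllers):
        self.controllers = controllers
        self.commander = None

    def compare(self, commander, job_id):
        """Seize the bus and select the controllers whose job id matches."""
        if self.commander not in (None, commander):
            raise RuntimeError("bus is held by another controller")
        self.commander = commander
        for c in self.controllers:
            c.listening = c is not commander and c.job_id == job_id

    def receive(self, data):
        """Transmit data to every selected controller."""
        for c in self.controllers:
            if c.listening:
                c.buffer = data

    def interact(self):
        """Selected controllers pass the received data to their processors."""
        for c in self.controllers:
            if c.listening and c.buffer is not None:
                c.executed.append(c.buffer)
                c.buffer = None

    def free_bus(self):
        """Release the bus so that waiting controllers may try to seize it."""
        self.commander = None
        for c in self.controllers:
            c.listening = False


if __name__ == "__main__":
    a, b, c = Controller("A", 7), Controller("B", 7), Controller("C", 9)
    bus = InteractionBus([a, b, c])
    bus.compare(a, job_id=7)
    bus.receive("abandon search")
    bus.interact()
    bus.free_bus()
    print(b.executed, c.executed)
```

Controllers not selected by the comparison take no part in the exchange, which mirrors the statement that the remaining units ignore further directives.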

5.    Storage  communication 

The fact that interest in large parallel processing systems is increasing
rapidly as technology enters into the integrated or monolithic
era is no coincidence. Such systems will not, in fact, be
practical for general purpose application until miniaturization
reaches the stage where the large amount of hardware required
can be assembled in compact fashion. This need is most apparent
when one considers communication between the high-speed store
and the various classes of processors, which may collectively be
termed Requestors. Already in presently available systems, the
transmission delay between storage and requestors is of the same
order of magnitude as the storage cycle time; and cycle times are
still decreasing.

Formulation of a hardware model as in Fig. 1 led to the immediate
conclusion that feasibility of the interconnection of large
numbers of units had first to be established. Many possible systems
were considered, and preliminary studies concluded that the
crossbar switch was the most appropriate system for early study
in view of its regular structure, simplicity, and basic modularity.
More particularly, monolithic crossbar modules are visualized
which it will be possible to interconnect to provide networks of
any required dimensions. Alternatively, or additionally, other
interconnections of these modules can provide highly available,
multi-level trunking systems.

In  addition  to  the  switch  proper,  the  crossbar  network  requires 
a  selection  and  control  mechanism.  It  is  moreover  appropriate  to 
locate  the  queues,  which  store  all  but  one  of  a  group  of  conflicting 
requests,  within  the  switching  area.  A  switch  complex,  as  in  Fig. 
4, has been designed for a system configuration including twenty-four
requestors, thirty-two memory modules, thirty-two data plus
four parity bit words, and sixteen plus two parity bit addresses.
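
The role of the queues in the switching area can be shown with a small model. It is a sketch under assumptions: each memory module accepts one request per storage cycle, conflicting requests for the same module are held in a first-in, first-out queue (the paper does not state the queue discipline), and the class CrossbarSwitch with its methods is a hypothetical construction. The default sizes follow the configuration quoted above.

```python
from collections import deque


class CrossbarSwitch:
    """Requestors on one side, memory modules on the other, a queue per module."""

    def __init__(self, num_requestors=24, num_modules=32):
        self.num_requestors = num_requestors
        self.queues = [deque() for _ in range(num_modules)]

    def request(self, requestor, module):
        """Register a storage request; it waits if the module is contended."""
        if not 0 <= requestor < self.num_requestors:
            raise ValueError("no such requestor")
        if not 0 <= module < len(self.queues):
            raise ValueError("no such memory module")
        self.queues[module].append(requestor)

    def cycle(self):
        """One storage cycle: every module serves the head of its queue.

        Returns a dict mapping module number -> requestor connected this cycle.
        """
        granted = {}
        for module, waiting in enumerate(self.queues):
            if waiting:
                granted[module] = waiting.popleft()
        return granted


if __name__ == "__main__":
    switch = CrossbarSwitch()
    switch.request(0, 5)
    switch.request(1, 5)      # conflicts with requestor 0 on module 5
    switch.request(2, 7)
    print(switch.cycle())
    print(switch.cycle())
```

Requests for different modules proceed in the same cycle, while all but one of a group of conflicting requests wait in the queue, as the text describes.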

The  result  of  this  design  study  shows  that  the  size  and  com- 
plexity of  such  a  switch  is  not  excessive  for  a  large  scale  system. 
In  its  simplest  form  and  using  standard  high-performance  logical 
devices,  with  a  fan-in  of  four,  a  fan-out  of  ten  and  a  four-way  OR 
capability,  its  use  leads  to  a  worst  case  delay  of  some  seven  logical 
levels  in  the  control  and  queue  decision  circuits  and  two  levels 
in  each  direction  of  the  switch.  The  switch  uses  between  two  and 
three  times  as  many  circuits  as  a  central  processor  such  as  the 
model 75 of System/360. While this, in itself, represents a considerable
amount of hardware, it is still an order of magnitude less
than the hardware found in the units that the switch is interconnecting.
Moreover, its regular structure and simple, repetitive
logic  suggest  ultimate  economical  realization  using  monolithic 
circuit  techniques. 
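
The selection-and-queue behaviour described above, in which all but one of a group of conflicting requests are held in the switching area, can be sketched as follows. This is a minimal illustration and not the circuit design: the function name `arbitrate`, the first-come-first-served queue discipline, and the lowest-numbered-requestor tie-break are all assumptions made for the example.

```python
from collections import deque

def arbitrate(requests, queues):
    """Model one switch cycle.

    requests: dict mapping requestor id -> memory module wanted this cycle.
    queues:   dict mapping module id -> deque of requestor ids still waiting.
    Returns a dict mapping module id -> requestor id granted this cycle.
    """
    # New requests join the back of the queue for their module
    # (ties broken by requestor number; an assumption).
    for requestor in sorted(requests):
        queues.setdefault(requests[requestor], deque()).append(requestor)
    granted = {}
    # Each module serves exactly one request per cycle; the rest stay queued.
    for module, waiting in queues.items():
        if waiting:
            granted[module] = waiting.popleft()
    return granted

queues = {}
# Requestors 0, 1 and 2 all want module 5; requestor 3 wants module 7.
first = arbitrate({0: 5, 1: 5, 2: 5, 3: 7}, queues)
second = arbitrate({}, queues)
third = arbitrate({}, queues)
```

With three requestors contending for one module, the conflict is resolved over three successive cycles while the request for the other module is served at once.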

6.  Usage 

6.1    The  executive  system 

The  basic  properties  outlined  in  Sec.  2  give  parallel  processing 
systems  the  potential  to  overcome  many  of  the  ills  and  shortcom- 
ings that  presently  beset  computer  systems.  For  maximum  effec- 
tiveness, the  system  must  be  library-  or  file-oriented.  It  can,  how- 
ever, be  exploited  efficiently  only  if  the  overhead  resulting  from 
executive  control  and  supervisory  activity  does  not  strangle  the 
system.  More  particularly,  the  gains  from  the  sharing  of  resources 
and  any  peak  averaging  effect  must  exceed  any  additional  over- 
head due  to  resource  allocation  procedures,  conflict  resolutions, 
and  other  processing  activity  arising  from  the  concurrent  operation 
of  many  units.  Thus  a  unified  and  integrated  design  approach  is 
required  in  which  software  and  hardware,  operating  system  and 
processing  units,  lose  their  separate  identities  and  merge  into  one 


Fig. 4. The centralized crossbar switch. [Diagram labels: request signal, accept signal, storage select, end of storage cycle, decision section, crossbar switch; paths of 16 address bits and 36 data bits; connections to other requestors, other switching section inputs, other decoders, and other scanners.]


462  Part  5  |  The  PMS  level 


Section  3  |  Computers  for  multiprocessing  and  parallel  processing 


overall  complex,  for  which  allocation  and  scheduling  procedures, 
for  example,  are  as  basic  and  as  critical  as  arithmetic  operations. 

Equally  significant  to  the  successful  exploitation  of  parallel 
processing potential are the problems of data management,
man-machine interaction, and, most generally, problem preparation
and usage of the system. We restrict the present discussion to brief
comments on programming techniques for task generation and on
the  development  of  algorithms  possessing  macro-parallelism.  In 
particular  we  indicate  that  multi-instruction-counter  systems  can 
be  profitably  applied  to  the  solution  of  the  large  problems  whose 
computing  requirements  tax  the  speed  capability  and  storage  of 
the  largest  computer  and  the  patience  of  their  users.  In  the  fol- 
lowing section  we  evaluate  these  proposals  by  quoting  some  per- 
formance measurements  obtained  from  an  executing  simulator. 

6.2    Programmed  task  generation 

Study  of  the  usage  of  parallel  processing  systems  for  the  rapid 
solution  of  large  real-time  problems  involves  two  aspects.  On  the 
one  hand  we  must  consider  the  development  of  algorithms  dis- 
playing an  appropriate  form  of  macro-parallelism.  On  the  other 
hand  programming  techniques  must  be  developed  for  efficient 
exploitation  in  terms  of  both  problem-  and  machine-oriented 
instructions, such as those discussed in Sec. 4.

It is appropriate to discuss programmed task generation first.
For simplicity we consider a job segment that requires n executions
of a procedure I. The procedure will itself include modification
of index registers or other changes that distinguish the individual
tasks. We assume that on completion of all n tasks, a new procedure
J should be initiated. Moreover, should processing power be
available at a time when n executions of I have been initiated but
not all n completed, we assume that an independent procedure
K, belonging to the same job, may be initiated. In the simplest
case  K  will  be  a  terminate  instruction  which  releases  the  processor, 
and  makes  it  available  to  process  other  work  as  determined  from 
the  work-queue  complex. 

      A = 0
      B = 0
      C = 0
ST    IF N - B < 1 THEN GO TO IN        Suppress split if nth task being initiated
      A = A + 1
      IF A > P THEN GO TO IN            Split if less than p processors allocated
      SPLIT TO ST
      B = B + 1
IN    IF B > N THEN GO TO FIN           If all n I-tasks started, proceed with K
      CALL I PROCEDURE
      C = C + 1
      IF C < N THEN GO TO IN            If all n I-tasks completed, proceed with J
      CALL J PROCEDURE
FIN   CALL K PROCEDURE

Execution  of  split  and  terminate  instructions  involves  executive 
overheads,  so  that  these  instructions  should  not  be  used  indiscrim- 
inately. Within  a  system  in  which  a  maximum  of  p  processors  are 
available  to  a  job,  it  is  pointless  to  partition  a  job,  at  any  one  time, 
into  more  than  p  tasks.  It  is,  however,  undesirable  to  guarantee 
a  user  that  p  processors,  or  even  more  than  one  processor,  will 
execute  his  program.  A  simple  task  generation  scheme  that  makes 
as  many  entries  in  the  task  queue  as  there  are  potentially  concur- 
rent parts  of  the  algorithm  (for  example,  from  a  loop  containing 
a  split  instruction)  is  inefficient  when  that  number  is  much  larger 
than  the  number  of  processors  that  happen  to  be  available.  The 
technique  also  leads  to  very  large  queues.  An  alternative,  termed 
Onion  Peeling  by  us,  puts  the  instruction  sequence  containing  the 
split  at  the  head  of  procedure  /  and  ends  each  execution  of  the 
procedure  with  a  terminate.  This  restricts  the  queue  length  for 
this  job  segment  to  one  but  it  otherwise  is  as  inefficient  as  the 
previous  method. 

A Modified Onion Peeling scheme (MOP) restricts the split and
terminate overhead to at most one more¹ than the number of
processors actually applied to the segment. It also ensures that
processing is completed as quickly and as efficiently as possible
with the number of processors that become available to the job
segment. Thus if during execution no further processors are freed,
the n tasks are executed sequentially with only one split and no
terminate. If, on the other hand, some other number of processors
is used for execution, the procedure is speeded up accordingly.
The maximum number p of processors that may be applied to the
job may be limited by the number of processors in the system and
available, or by executive edict.

The basic scheme was illustrated by the above program, in
which the first expression following the zeroing of the counters
ensures that no unnecessary splits are queued.

¹This is not quite accurate. The simple MOP algorithm presented here
does  not  explicitly  interlock  the  split  sequence.  There  is  therefore  a  possi- 
bility that  unnecessary  task-calls  may  be  queued  during  the  execution  of 
the  split  which  is  to  generate  the  nth  task.  The  probability  of  this  is, 
however,  small,  while  the  degradation  arising  from  an  interlock  could  be 
significant,  and  the  algorithm  in  the  form  given  appears  more  economical. 
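
The spirit of the MOP scheme can be sketched with ordinary threads. This is a hedged illustration under stated assumptions, not the program given above: operating-system threads stand in for processors, so a split always obtains a worker immediately instead of waiting in a task queue, and the names `run_segment`, `proc_i`, and `proc_j` are invented for the example. Unlike the simple algorithm in the text, the sketch interlocks the split test with a lock.

```python
import threading

def run_segment(n, p, proc_i, proc_j):
    """Run proc_i(k) for k = 0..n-1 on at most p workers, then proc_j() once.

    Returns the number of splits performed. Assumes n >= 1.
    """
    lock = threading.Lock()
    state = {"allocated": 1, "started": 0, "completed": 0, "splits": 0}
    spawned = []

    def worker():
        while True:
            with lock:
                # Split only if a task remains beyond the one this worker is
                # about to take, and fewer than p workers are allocated.
                if n - state["started"] > 1 and state["allocated"] < p:
                    state["allocated"] += 1
                    state["splits"] += 1
                    thread = threading.Thread(target=worker)
                    spawned.append(thread)
                    thread.start()
                if state["started"] >= n:
                    return              # nothing left to start: terminate (K)
                k = state["started"]
                state["started"] += 1
            proc_i(k)
            with lock:
                state["completed"] += 1
                last = state["completed"] == n
            if last:
                proc_j()                # all n I-tasks completed: proceed with J
                return

    worker()                            # the calling thread is the first processor
    for thread in spawned:
        thread.join()
    return state["splits"]

done, after = [], []
splits = run_segment(10, 4, done.append, lambda: after.append("J"))
solo = []
solo_splits = run_segment(5, 1, solo.append, lambda: None)
```

With p = 4 the ten tasks are shared among four workers after three splits; with p = 1 the tasks run sequentially and no split occurs at all, whereas the scheme in the text still queues one split in that case.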


Chapter 37 | A survey of problems and preliminary results concerning parallel processing and parallel processors


6.3  Macro-parallelism 

Commonly  used  numerical  algorithms,  data  processing  procedures, 
and  computer  programs  are  generally  sequential  in  nature.  The 
reason  for  this  is  largely  historical,  a  consequence  of  the  fact  that 
the mechanisms, human, mechanical, and electronic, used in
developing and executing these procedures have been incapable
of significant parallel activity, other perhaps than the simultaneous,
coordinated use of many humans. The advent of parallel
processing  systems  thus  calls  for  the  modification  of  accepted 
techniques  to  expose  any  inherent  parallelism.  The  resultant  pro- 
cedures must  then  be  further  adapted  to  make  parallel  tasks  of 
such  a  magnitude  that  the  overhead  involved  in  their  generation 
becomes insignificant. But the ultimate benefit from parallel execution
will be obtained only by going back to the problems themselves.
These must be analyzed anew, and algorithms must be developed
that make it possible to exploit the parallel executing capability,
by introducing into the mathematical and program model
parallelism that ultimately reflects the parallelism of the physical
system or phenomena being studied. In this need to return to
fundamentals, the situation is somewhat analogous to the early
days  of  electronic  computing,  when  attempts  at  commercial  ap- 
plication were  largely  frustrated  until  it  was  realized  that  wide- 
spread application  required  the  development  of  new  techniques, 
rather  than  the  adaptation  and  mechanization  of  existing  proce- 
dures. 

At the present time, however, our direct activity in problem
analysis has concentrated mainly on the adaptation of existing
numerical techniques for parallel processing, for problems in
which the basic macro-parallelism was self-evident. These include,
for example, linear algebra and the solution of elliptic partial
differential equations. In these areas the extent and nature of the
parallelism had previously led to proposals for vector processing
systems such as Solomon [Slotnick et al., 1962; Gregory and
McReynolds, 1963] and Vamp [Senzig and Smith, 1965]. Other
areas  in  which  the  parallelism  is  self-evident  but  where  vector 
processors  prove  less  effective  are  those  in  which  the  algorithms 
model  distinct  physical  activities  such  as  in  file  processing  and 
Monte  Carlo  techniques.  For  all  significant  problems  investigated 
[Schlaeppi,  19??]  it  was  possible  to  establish  the  existence  of 
parallel  tasks  of  such  a  length  that  tasking  overheads  could  be 
expected to be negligible.

Other  classes  of  problems  have  been  studied,  both  in  terms  of 
the  extension  of  existing  algorithms  and  the  development  of  new 
ones.  In  particular  we  refer  to  the  extraction  of  polynomial  roots 
[Shedler and Lehman, 1966], solution of equations [Shedler,
1966], and the solution of linear differential equations [Nievergelt,
1964], [Miranker and Liniger, 1967]. These various studies,
not  all  directly  related  to  the  present  project,  were  more  mathe- 
matical in  nature,  and  to  the  best  of  our  knowledge,  no  attempt 
has  yet  been  made  to  develop  efficient  parallel  computer  programs. 
Thus,  while  numerical  methods  are  beginning  to  emerge  which 
enable  the  exploitation  of  macro-parallelism  in  the  solution  of 
time-limited  problems,  and  from  which  it  appears  that  significant 
reductions  may  be  obtained  in  throughput  times,  much  work 
remains  to  be  done  on  re-programming  the  problems  themselves. 

7.  Simulation 

7.1  Simulation as a design tool

It  has  been  our  experience  with  simulation  that  its  principal 
function as a design tool is to focus attention on features that
require investigation and explanation. Many results, qualitative
and  quantitative,  that  are  obtained  during  simulation  experiments 
may  also  be  obtained  analytically.  It  is,  however,  the  insight  and 
understanding  gained  from  the  design  of  simulation  experiments 
and  the  analysis  of  their  results  that  draws  attention  to  specific 
details  and  difficulties.  The  undeniable  value  of  simulation  in 
development  and  design  is  therefore  quite  different  from  that  in 
system  evaluation,  where  meaningful  performance  figures  may  be 
obtained  when  the  work  load  is  well  defined. 

7.2  The  executing  simulator 

In  the  present  study  simulation  was  seen  as  fulfilling  a  number 
of additional functions. In particular it made available a usable
working  model  of  a  parallel  processing  system.  This  would  give 
potential  users  the  incentive  to  undertake  actual  programming  and 
to gain limited operational experience. An executing simulator was
also  required  for  the  investigation  of  what  is  commonly  regarded 
as  the  most  immediate  question  in  parallel  processing,  the  extent 
of  performance  degradation  due  to  storage-access  interference  and 
executive  (queue-access)  interference.  Such  an  executing  simulator 
is  now  operational  and  its  use  is  discussed  in  the  next  section.  We 
note parenthetically that a limitation of this type of simulator is its
speed. For the evaluation of total system performance over any
length of time, particularly when using a computer itself much
slower than the simulated system, only gross, nonexecuting simulation
is reasonable [Katz, 1966].

The system presently modeled in the executing simulator includes
the processors, switch, and storage modules of Fig. 1. The
storage modules are accessed through a fully interleaved address
structure,  though  it  is  clear  that  in  any  realization  interleaving 
will  be  partial,  both  to  sustain  high  availability  and  to  decrease 
storage  interference  between  independent  jobs.  The  individual 
processors have a System/360-like structure [Blaauw and Brooks,
1964] and execute an augmented subset of S/360 machine language.
The nonstandard instructions added to the repertoire include
the functions discussed in Section 4. The local store LSi,
to  be  used  also  as  an  instruction  buffer,  is  however  not  included 
in  the  model  for  which  the  interference  results  are  quoted  in  the 
next  section.  The  simulator  configuration  is  parameterized  so  that, 
for  example,  the  numbers  of  storage  modules  and  processors, 
instruction execution times (in storage cycles), and the nature of
statistics  gathered  and  printed  may  be  selected  for  each  run.  The 
program  itself  is  modular,  and  both  system  features  and  measure- 
ment facilities  may  be  expanded  or  modified  as  required. 

7.3    Simulator  experiments 

7.3.1  Kernels.  Simulation  experiments  first  concentrated  on  an 
investigation  of  storage  interference  arising  in  the  execution  of 
typical  kernels  from  numerical  analysis.  The  results  indicated  that 
under  the  limited  condition  of  the  experiments  and  for  a  storage 
module-to-processor  ratio  of  two,  interference  would  degrade 
performance  by  less  than  twenty  percent,  dropping  to  some  five 
percent for a storage module-to-processor ratio of eight. Addition
of  a  local  processor  store  and  its  use  as  an  instruction  buffer 
effectively  eliminated  interference,  as  expected,  indicating  that 
it  had  been  substantially  due  to  instruction-fetch  interference. 

These  results  were  considered  to  have  been  generated  under 
conditions  too  restrictive  to  permit  generalization.  In  particular 
each  set  referred  only  to  concurrent  executions  of  a  single  loop. 
Thus  more  recent  experiments  have  included  many  runs  of  a 
matrix-multiply  subroutine  and  the  solution  of  an  electrical  net- 
work problem  using  an  appropriately  modified  version  of  the 
Jacobi  variant  of  the  Gauss-Seidel  solution  of  a  set  of  linear  alge- 
braic equations. 

7.3.2  The matrix multiplication. The Matrix Multiply program
was  written  in  two  versions.  A  classical  sequential  program  ex- 
cluding all  the  special  instructions  provided  the  standard  on  which 
measurement  of  the  parallelism  overhead  and  interference  could 
be based. The second, parallel, program used the onion peeling
rather than the MOP algorithm described in Sec. 6.2. The product
matrix  was  partitioned  by  rows,  with  the  computation  of  each 
comprising  one  task.  The  experiments  were  performed  for  square 
matrices  of  dimensions  thirty-nine  and  forty  with  from  one  to 
sixteen processors and sixteen to sixty-four storage modules. Two
sizes of matrices were used to isolate the effect of commensurate
periodicities of array mapping with the address structure of the
store, which demonstrably had significant influence on the
results.
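
Why a dimension of forty behaves differently from thirty-nine can be seen from a small calculation. The sketch below is illustrative only and rests on assumptions the text does not state: arrays stored row by row at one word per element, and full interleaving in which the module number is the address modulo the number of modules. The function name `modules_touched` is invented for the example.

```python
from math import gcd

def modules_touched(stride, n_modules, n_accesses):
    """Distinct modules hit by addresses 0, stride, 2*stride, ... when
    module = address mod n_modules (assumed full interleaving)."""
    return len({(i * stride) % n_modules for i in range(n_accesses)})

# Walking down a column of a row-major array steps the address by the row length.
col_40 = modules_touched(40, 64, 40)   # 40 x 40 array, 64 modules
col_39 = modules_touched(39, 64, 39)   # 39 x 39 array, 64 modules

# The number of distinct modules reachable at a given stride is
# n_modules / gcd(stride, n_modules).
period_40 = 64 // gcd(40, 64)
period_39 = 64 // gcd(39, 64)
```

Under these assumptions a column walk through the 40 x 40 array returns to the same eight modules over and over, while the 39 x 39 array spreads its column accesses over thirty-nine different modules.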

Instruction execution times for the most frequently executed
instructions  used  in  the  experiment  are  given  in  Table  2. 

These  times  exclude  the  instruction  fetch  time  (one  instruction 
for  each  fetch),  since  these  are  overlapped  unless  storage  conflict 
occurs,  when  a  request  must  be  queued.  The  arithmetic  operations 
may  also  include  a  data  fetch  (RX  instructions)  in  which  case  a 
further  store  access  time  is  required. 

In  the  absence  of  an  internal  instruction  buffer,  processors 
executing  the  same  program  string  interfere  with  each  other 
continuously  during  instruction  fetches.  To  minimize  this  effect 
for  loops  that  are  short  relative  to  the  width  of  the  interleaving, 
it  is  profitable  to  unwind  such  loops  by  repetition  so  that  the 
resultant  string  stretches  as  far  as  possible  across  the  interleaved 
store.  The  program  was  unwound  in  this  way.  We  note,  however, 
that  it  is  in  fact  better  [Rosenfeld,  1965]  to  repeat  the  loop, 
appropriately  modified,  several  times  across  the  interleaved  store, 
directing  successive  processors  to  successive,  but  unconnected, 
loops.  This  can  decrease  interference  by  as  much  as  twenty  percent 
over  the  previous  case. 

Some  results  of  the  simulation  are  given  in  Table  3  and  plotted 
in  Figs.  5  and  6. 

We  note  that  running  time  (col.  4)  is  defined  as  the  interval 
between  the  start  of  the  first  processor  on  its  first  task  and  the 
completion,  by  the  last  processor  to  finish,  of  its  final  task.  Since 
an  onion  peel  technique  has  been  used  for  the  splitting,  there  is 
an  interval  (of  order  70  storage  cycles)  between  the  start  of  suc- 
cessive tasks.  There  is  also  an  initial  interval  (87  memory  cycles) 
in  which  the  first  processor  initializes  the  program.  Finally,  the 
finish  of  processors  is  staggered  and,  in  particular,  for  the  sixteen- 
processor  case,  eight  processors  are  assigned  two  tasks  (rows)  in 
succession,  and  eight,  three  tasks.  The  former  processors  will,  of 


Table 2

Instruction                            Execution time in storage cycles

Fixed Point Addition                    0.4
Floating Point Addition                 0.5
Floating Point Multiplication           1.0
Floating Point Division                 2.0
Split                                  25.0
Terminate                              25.0
New Task Fetch (Part of Terminate)     25.0




Table 3  Results of the matrix multiply simulation

  1       2        3      4      5       6         7         8         9          10
No. of  No. of    Dim.   Run   Total   Storage   Storage   Exec.     No. of     Storage       11
proc.   storage          time  proc.   interf.   interf.   interf.   storage    utilization   Notes
        mods.                  time    time      %         %         accesses   %

  1      64       40     427    427      1.02     0.2       NA        459K        1.69        Sequential program
  1      64       40     429    429      0.21     0.05      NA        460K        1.68        Interference between instruction & data fetches
  2      64       40     216    432      1.77     0.4       0.33      460K        3.3
  4      64       40     109    436      5.79     1.3       0.39      460K        6.6
  8      64       40      56    445     14.4      3.3       0.68      460K       13.0
 16      64       40      35    461     30.3      7.0       0.76      460K       25.0
 16      32       40      38    507     75.9     17.7       0.88      460K       45.4
 16      16       40      47    639    207.0     48.2       0.64      460K       72.1
 16      64       39      33    428     26.1      6.5       NV        427K       26.9

Storage utilization (col. 10) = (Col. 9 × Col. 1) / (Col. 5 × Col. 2)

Note: All times in thousands of storage cycles.
NA: Not Applicable
NV: Not Available


course,  terminate  considerably  earlier  than  the  latter.  Thus,  as 
indicated  by  the  corresponding  entry  in  column  four,  the  particu- 
lar mode  of  partitioning  is  not  optimum  if  the  shortest  execution 
time is to be obtained. From a system efficiency point of view,
however, and in actual operation with other jobs and tasks in the
system, it is of no consequence since processor idling does not
actually  occur.  New  tasks,  perhaps  arising  from  quite  different  jobs, 
are  initiated,  according  to  some  scheduling  strategy,  whenever  a 
processor  becomes  spontaneously  available. 
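
The staggered finish in the sixteen-processor case follows from simple counting, as the sketch below suggests. It is a rough bound only: it ignores the interval between successive task starts, the initialization interval, and all interference, and the helper name `rows_per_processor` is invented for the example. The sequential run time of 427 thousand storage cycles is taken from Table 3.

```python
def rows_per_processor(n_rows, n_proc):
    """Share n_rows tasks as evenly as possible among n_proc processors."""
    base, extra = divmod(n_rows, n_proc)
    return [base + 1] * extra + [base] * (n_proc - extra)

shares = rows_per_processor(40, 16)

sequential = 427.0                       # thousands of storage cycles (Table 3)
bound = max(shares) / 40 * sequential    # time for a processor given three rows
ideal = sequential / 16                  # time if the work divided evenly
```

Forty rows on sixteen processors leave eight processors with three rows and eight with two, so the run cannot finish in less than about 32 thousand cycles on this estimate, against about 27 thousand for a perfectly even division. The measured run time in Table 3 is 35 thousand.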


Fig. 5. Execution time for matrix multiply. [Plot: execution time against number of processors for the parallel processor program and the uniprocessor program; (40 × 40) × (40 × 40) matrices, 64 storage modules.]

Fig. 6. Total processor time and interference in matrix multiply modules. [Plot: total time processors active and total delay due to storage interference against number of processors; (40 × 40) × (40 × 40) matrices, 64 storage modules.]

In addition to run time, we define a total processor time (col.
5). This represents the sum total of time that individual processors
were active in the program and is therefore a reflection of total
processor running cost. Storage interference (cols. 6, 7) measures
the total time that processors were inactive due to attempts to
initiate simultaneous accesses to the same storage module. It
occurs also when only a single processor is applied, when it represents
a conflict between a data fetch and an attempt by the overlap
circuit  to  initiate  an  instruction  fetch  from  the  same  module. 




Fig. 7. Total processor and throughput times in electrical network analysis, 16 storage modules. [Plot: times in storage kilocycles against number of processors, for inner loops of 2, 3, 4, and 5 equations.]

Executive  interference  (col.  8)  represents  processor  hold-ups  due 
to  the  simultaneous  attempts  by  two  or  more  processors  to  access 
the  system  work-queues.  These  interferences  are  of  course  repre- 
sentative of  a  whole  class  of  effects  that  can  lead  to  performance 
degradation  in  parallel  processing  systems. 

In Table 3 interference has been related to the number of
interleaved storage modules and to the number of processors. In
an actual system it is of course a complex function of the number
of  storage  modules,  of  the  degree  of  address  interleaving,  of  the 
relationship  between  active  jobs  and  the  degree  of  program  and 
data  sharing,  and  of  the  total  system  utilization  of  storage.  In 
optimizing  a  design,  the  numbers  of  processors  and  storage  mod- 
ules and  the  addressing  scheme  must  be  fixed  subject  to  constraints 
related  to  cost,  total  storage  capacity,  the  capacity  of  available 
storage  modules,  the  degree  of  availability  desired,  and  the  ex- 
pected nature  of  the  work  load.  Processor  utilization  of  storage 
alone  is  not  very  significant,  since  a  critical  factor  is  the  I/O 
storage activity present, the degree of storage utilization required
to get program and data into the high-speed store and to output
results.  We  include  utilization  figures  for  these  executions  in  Table 
3,  to  aid  in  analysis  of  the  system  behavior  but  not  for  evaluation 
purposes. 
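
The derived columns of Table 3 can be checked against the measured ones with a few lines of arithmetic. The utilization formula is the one given beneath the table. The interference percentage is taken here relative to interference-free processor time; that reading is an inference from the figures and is not stated in the text. The function names are invented for the example.

```python
def utilization_pct(n_proc, n_mod, total_proc_time, accesses):
    """Storage utilization: (accesses x processors) / (total proc. time x modules).

    total_proc_time / n_proc approximates the run time, and each module can
    serve at most one access per storage cycle.
    """
    return 100.0 * accesses * n_proc / (total_proc_time * n_mod)

def interference_pct(total_proc_time, interference_time):
    """Interference as a percentage of interference-free processor time
    (assumed definition, inferred from the table)."""
    return 100.0 * interference_time / (total_proc_time - interference_time)

# Sixteen processors, sixty-four modules, dimension 40 (times in thousands of cycles).
util_16_64 = utilization_pct(16, 64, 461, 460)
interf_16_64 = interference_pct(461, 30.3)

# Two processors, sixty-four modules, dimension 40.
util_2_64 = utilization_pct(2, 64, 432, 460)
interf_2_64 = interference_pct(432, 1.77)
```

The results agree with the tabulated 25.0 and 7.0 percent, and 3.3 and 0.4 percent, to within the rounding of the printed figures.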

7.3.3  The  electrical  network  analysis  problem.  This  problem 
represents  the  solution  of  a  set  of  simultaneous  linear  equations, 
described  by  a  sparse  coefficient  matrix.  The  technique  used  for 
its  solution  on  the  executing  simulator  essentially  comprises  a 
relaxation procedure. Extensive runs have been made using a
specific  thirty-six  node  network,  yielding  twenty-six  equations  with 
up  to  four  terms  in  each  equation. 

From  the  wealth  of  results  obtained  we  present  representative 
sets  that  indicate  some  general  trends  related  to  the  characteristics 
and  performance  of  the  parallel  processing  system.  Available  space 
will  not  permit,  however,  detailed  analysis  in  the  present  paper, 
nor  does  it  permit  a  discussion  of  the  equally  interesting  results 
obtained  concerning  speed  of  convergence,  in  particular,  and  other 


Fig. 8. Total processor and throughput times in electrical network analysis, 32 storage modules. [Plot: times in storage kilocycles against number of processors, for inner loops of 2, 3, 4, and 5 equations.]




Fig. 9. Total processor and throughput times in electrical network analysis, 64 storage modules. [Plot: times in storage kilocycles against number of processors, for inner loops of 2, 3, 4, and 5 equations.]


effects  which  must  be  understood  within  the  framework  of  a 
numerical  analysis  of  the  relaxation  solutions. 

Figures 7, 8, and 9 present the basic performance data, throughput
time, and total processor time, for a total of one hundred and
forty-four cases. The variables are the number of processors in the
system (12 cases), the size of the inner loop as represented by the
number of currents (from 2 to 5) evaluated in the loop, and the
number of interleaved storage modules (16, 32, 64).

These  curves  clearly  indicate  the  reduction  in  throughput  time 
to  be  obtained  from  the  use  of  parallel  processing,  the  consequent 
increase  in  processor  cost  due  to  interferences  of  various  sorts,  the 
resultant  effect  of  diminishing  returns,  and  the  actual  increase  in 
throughput time, when too many processors chase too few equations
and generally get seriously "into each other's way."

For  the  smaller  inner  loops  and  when  interference  between 
processors  is  low,  total  processor  times  vary  somewhat  erratically. 
The causes for this are related to the relaxation pattern and the
rate of convergence in each case. In fact there appears to be strong
circumstantial evidence that an ad hoc procedure, which does not
guarantee sequential evaluation of the equations, improves performance.
This point, however, requires further study.

Figure  10  reproduces  some  of  the  results  of  the  previous  three 
figures  for  the  case  of  a  five-equation  inner  loop.  Table  4  lists  these 
same  results  as  a  percentage  of  the  time  using  one  processor  and 
compares  them  with  the  reciprocal  of  the  number  of  processors. 

Figure 11 indicates storage interference and parallel processing
overheads  as  a  function  of  the  number  of  processors,  with  storage 
modularity  again  a  parameter  and  an  inner  loop  again  comprising 


Fig. 10. Total processor and throughput times in electrical network analysis with number of storage modules as a parameter. [Plot: times in storage kilocycles against number of processors, five equations in a loop, for 16, 32, and 64 storage modules.]




Table 4  Run time for resistor network system relative to the run time
using one processor, with a five equation inner loop

                          Relative time
Number of      16 storage    32 storage    64 storage    100 / No. of
processors     modules       modules       modules       processors

   1           100%          100%          100%          100%
   2            52.8          51.2          51.2          50.0
   4            29.5          27.9          27.1          25.0
   6            22.4          20.3          19.5          16.7
   7            20.9          17.9          17.1          14.3
   8            19.2          16.8          15.8          12.5
   9            17.8          15.2          14.2          11.1
  10            17.6          14.5          13.7          10.0
  11            16.8          13.9          12.9           9.1
  12            17.5          13.9          13.0           8.3
  14            17.3          13.2          11.7           7.2
  16            17.7          13.7          11.7           6.3

the  evaluation  of  five  currents.  Storage  interference  has  previously 
been  defined.  The  parallel  processing  overhead  represents  as  a 
percentage  the  excess  of  total  number  of  storage  cycles  required 
for  execution,  excluding  storage  interference  cycles,  when  more 
than  one  processor  is  used,  relative  to  the  number  of  cycles  re- 
quired by  a  one-processor  execution. 
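
The relative run times of Table 4 can be turned into two derived figures that make the diminishing returns explicit. The quantities called `speedup` and `efficiency` below are introduced here for illustration and do not appear in the text; the input values are the 64-storage-module column of Table 4 for a selection of processor counts.

```python
# Run time as a percentage of the one-processor run time (Table 4, 64 modules).
relative_time = {1: 100.0, 2: 51.2, 4: 27.1, 8: 15.8, 16: 11.7}

# Speedup: how many times faster than the one-processor run.
speedup = {p: 100.0 / t for p, t in relative_time.items()}

# Efficiency: speedup per processor, 1.0 being the ideal 100 / p relative time.
efficiency = {p: speedup[p] / p for p in relative_time}
```

On these figures sixteen processors run the problem roughly eight and a half times faster than one, so each processor contributes a little over half of what it would in the ideal case.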


Fig.  11.  Storage  and  executive  interference. 


Fig.  12.  Storage  utilization  and  cost/performance  factors. 

Actual counts during execution show that in general some
sixty-seven percent of store accesses are instruction fetches in this
program  and  some  thirty-three  percent  are  data  fetches.  Thus 
incorporation  of  a  substantial  instruction  buffer  in  each  processor 
clearly  reduces  all  interference  by  an  order  of  magnitude,  since 
of  the  four  ways  in  which  a  storage  interference  can  occur,  only 
one — a  data  fetch  conflicting  with  a  data  fetch — remains  in  the 
inner  loop.  Moreover,  these  measurements  refer  to  a  processor  in 
which  arithmetic  speeds,  as  in  Table  2,  are  of  the  order  of  magni- 
tude of  a  memory  cycle  time,  which  implies  a  somewhat  powerful 
processor.  Thus  in  every  sense  the  interference  figures  are  worst 
case  results  which,  with  the  performance  curves  to  which  they 
relate,  support  the  view  that  storage  interference  is  not  a  serious 
obstacle  to  parallel  processing. 

The  four  contours  drawn  on  these  curves  represent  lines  of 
constant  storage  module-to-processor  ratio.  They  slope  slightly 
upward  due  to  the  statistical  Marbles  and  Boxes  [Rosenfeld,  1965] 
effect previously referred to.
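
The upward slope of the constant-ratio contours can be reproduced with a simple occupancy calculation. The sketch reads "Marbles and Boxes" as the classical problem of dropping marbles at random into boxes; the model of each processor addressing one module uniformly and independently in every cycle is an assumption made for the example and is not taken from the cited report.

```python
def blocked_fraction(n_proc, n_mod):
    """Expected fraction of simultaneous requests that must wait when each of
    n_proc processors addresses one of n_mod modules uniformly at random."""
    # Expected number of distinct modules addressed, each serving one request.
    served = n_mod * (1.0 - (1.0 - 1.0 / n_mod) ** n_proc)
    return 1.0 - served / n_proc

# Constant storage module-to-processor ratio of four, growing system size.
ratio_4 = [blocked_fraction(p, 4 * p) for p in (2, 4, 8, 16)]

# A single processor never conflicts with itself in this model.
single = blocked_fraction(1, 8)
```

At a fixed ratio of four modules per processor the blocked fraction in this model rises from about six percent with two processors to about eleven percent with sixteen, which is the gentle upward trend described.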

Figure  12  presents  two  sets  of  data,  based  on  the  five-equation 
inner loop. The upper family of curves relates to storage utilization.
The  reservations  made  at  the  end  of  Sec.  7.3.2,  with  reference  to 
the  significance  of  utilization  figures,  also  apply.  The  second  family 
of  curves  represents  a  first  attempt  at  estimating  the  relative 
quality of processing, that is, some function of a cost/performance




factor.  Such  a  factor  is  intuitive  and  environment-sensitive,  de- 
pending on  the  relative  concern  for  speed  and  for  costs  of  various 
sorts. For the present data we have chosen to display a function:

K / (throughput time × total processor time)

where  K  is  a  constant,  throughput  time  a  measure  of  the  speed 
of  computation,  and  total  processor  time  a  measure  of  the  cost. 

8.  Conclusion 

In this paper we have presented some thoughts on parallel process-
ing. In particular we have chosen to survey the topic by including
an  extensive  bibliography  and  some  of  the  results  of  our  work  in 
this  area.  The  discussion  has  had  to  be  brief,  but  our  intention 
has  been  to  convey  the  picture  of  the  potential  that  parallel 
processing  systems  offer  for  the  future  development  of  computing. 

The key to successful exploitation lies in a new, unified, and
scientific  approach  to  the  entire  problem  of  the  design  and  usage 
of  computing  systems.  The  development  of  large,  integrated  sys- 
tems raises  many  problems,  but  there  can  be  no  doubt  that  eco- 
nomic solutions  to  these  will  be  found.  Their  development  should 
comprise a significant part of the computer system architectural
design  effort  of  the  next  few  years. 


Any ultimate evaluation of a parallel processing system within
a  working  environment  depends  on  actual  operating  experience. 
This  in  turn  requires  the  existence  of  a  system  and  the  interest 
of  users.  Only  when  usable  systems  become  available  will  the 
concept of parallel processing in integrated systems be accurately
evaluated. 

References 

BlaaG64; BrigH64; ConwM63; CorbF65; DennJWj; DesmW64; DreyP58;
FalkA64; GillS58; GregJ63; KatzJ66; LehmM65; LeinA59; McCuJ65;
MiraW67; NievJ64; RoseJ65; SchlH??; ShedG66a, b; SlotD62; SmitR64;
PL/I Language Specification, Form C28-6571

Bibliography 

AlleM63; AmdaG62; AndeJ62, 65; ArdeB66; BaldF62; BlaaG64; BrigH64;
BuchW62; BussB63; CoddE62; ComfW65; ConwM63; CorbF62, 65;
CritA63; DaleR65; DennJ65, 66; DesmW64; DijkE65; DreyP58; EmsH63;
EstrG60, 63; EwinR64; FalkA64; ForgJ65; FranJ57; GillS58; GlasE65;
GregJ63; IlelllieL 66; KatzJ66; KinsH64; KnutD66; LehmM63a, 63b, 65;
LeinA59; L()iir\.59; MarcM63; McCaJ62; McCuJ65; MeadR63; MillW63;
MiraW67; NievJ64; OssaJ65; PennJ62; RoseJ65; SchlH??; SeebR63; SenzD65;
ShedG66a, 66b; SlotD62; SmitR64; SquiJ63; StraC59; \Vss\'65; WirtN66;
IBM OS/360 PL/I Language Specification, Form C28-6571; Proc. IFIPiml.
"Symposium on Multi-Programming," 1963.


Section  4 

Network  computers  and  computer 
networks 

The  RW-400  and  the  CDC  6600  are  actually  computer  networks 
by  our  definition  of  a  computer  (Chap.  2,  page  17).  Yet  because 
of  the  restrictions  on  the  quantity  and  location  of  the  compo- 
nents in  these  structures,  we  still  consider  them  to  be  com- 
puters. On  the  other  hand,  two  or  more  computers  which  are 
separated  physically,  yet  connected,  constitute  a  computer 
network.  Computer  networks  will  appear  in  the  future;  it  is 
important  to  understand  the  basis  for  them. 

The  RW-400— a  new  polymorphic  data  system 

Chapter  38  presents  the  RW-400  (also  called  the  AN/FSQ-27), 
a  later  version  of  the  Ramo-Wooldridge  RW-40  originally  de- 
signed in  1959.  The  diagram  (page  478)  gives  an  indication 
of  the  relationship  and  names  of  the  components.  The  PMS 
structure  in  Fig.  1  has  more  configuration  details.  At  least  six 
RW-400's  were  built  for  military  command  and  control  applica- 
tions (although  the  number  of  computers  of  a  type  in  existence 
has  little  to  do  with  a  machine's  worth  or  ability). 

The  RW-40  ISP  as  given  in  Appendix  1  of  Chap.  38  is  a 
good  example  of  a  processor  with  a  two-address  instruction  set. 
The  ISP  does  not  have  index  registers;  it  has  a  small  state 
consisting  of  the  accumulator  (A),  a  limited  extended  accumu- 
lator (B),  the  program  counter  (P),  and  about  6  state  bits.  The 
Pc  is  limited  by  its  ability  to  address  directly  only  a  1,024-word 
Mp.  The  ISP  is  undoubtedly  sufficient  for  solving  the  kinds  of 
problems  encountered  by  the  computer  and  compares  favorably 
with  Whirlwind  and  the  IBM  1800. 

The  RW-40  introduced  multiple  parts  for  reliability  [Roth- 
man,  1959].  Multiple  C's  (or  Mp— Pc  and  Mp— Pio)  are  provided 
for  redundancy  and  capacity.  However,  the  S('Central  Ex- 
change) which  provides  communication  among  the  C's  may  not 
have  redundant  parts.  The  multiple-computer  concept  can  be 
viewed  as  the  forerunner  to  our  present  computer  networks, 
in  which  the  central  switching  element  is  the  Telephone  Ex- 
change. Over  a  longer  time  span,  the  RW-400  may  be  most 
significant  as  a  pioneer.  However,  the  whole  system,  with  the 
exception  of  the  small  Mp's,  is  nicely  designed.  The  problem 
of  low  speed  T(typewriter,  display)'s  is  handled  well  by  trans- 
ferring data  from  Mp— Pc  to  Ms(drum)  for  concurrent  and 


independent  T  and  P  activity.  Similar  solutions  are  common 
for  managing  T  activity  by  using  an  M,  local  to  particular  T's, 
and  local  C's. 

The  structure  should  be  compared  with  the  CDC  6600  (Chap. 
39)  and  the  network  examples  in  Chap.  40. 

The  CDC  6400,  6500,  6600,  6416,  and  7600 

The  CDC  6600  development  began  in  1960,  using  high-speed 
transistors  and  discrete  components  of  the  second  generation. 
The  first  6600  was  delivered  in  September,  1964.  Subsequent 
compatible  successors  included  the  6400,  in  April,  1966,  which 
was  implemented  as  a  conventional  Pc(a  single  shared  arith- 
metic function  unit  instead  of  the  10  D's);  the  6500  in  October, 
1967,  which  uses  two  6400  Pc's;  and  the  6416  in  1966,  which 
has  only  peripheral  and  control  processors.  The  first  7600, 
which  is  nearly  compatible,  was  delivered  in  1969.  The  dual 
processor  6700,  consisting  of  two  6600  Pc's  was  introduced 
in  October,  1969.  Subsequent  modifications  to  the  series  in 
1969  included  the  extension  to  20  peripheral  and  control 
processors  with  24  channels.  CDC  also  marketed  a  6400  with 
a  smaller  number  of  peripheral  and  control  processors  (e.g., 
6415-7 with 7). Reducing the maximum PCP number to 7
also  reduced  the  overall  purchase  cost  by  approximately  $56,000 
per  processor. 

The  computer  organization,  technology,  and  construction 
are  described  in  Chap.  39.  ISP  descriptions  for  both  the  Pc  and 
Pio('Peripheral and Control Processors/PCP) are given in Ap-
pendices 1 and 2 of Chap. 39.

To  obtain  the  very  high  logic  speeds,  the  components  are 
placed  close  together.  The  logic  cards  use  a  cordwood-type 
construction.  The  logic  is  direct-coupled  transistor  logic,  with 
5  nanoseconds  propagation  time  and  a  clock  of  25  nano- 
seconds. The  fundamental  minor  cycle  is  100  nanoseconds  and 
the  major  cycle  is  1,000  nanoseconds,  also  the  memory  cycle 
time.  Since  the  component  density  is  high  (about  500,000 
transistors  in  the  6600),  the  logic  is  cooled  by  conduction  to 
a  plate  with  Freon  circulating  through  it. 
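
The timing figures quoted above nest neatly, which a few lines of arithmetic make explicit. This is a sketch based only on the numbers in the paragraph; the variable names are introduced here, and the count of gate delays per clock is a simple quotient, not a statement about the actual logic depth of the machine.

```python
# Timing figures quoted in the text, in nanoseconds.
PROPAGATION_NS = 5
CLOCK_NS = 25
MINOR_CYCLE_NS = 100
MAJOR_CYCLE_NS = 1000  # also the memory cycle time

# How the periods nest inside one another.
clocks_per_minor_cycle = MINOR_CYCLE_NS // CLOCK_NS
minor_cycles_per_major_cycle = MAJOR_CYCLE_NS // MINOR_CYCLE_NS
# Upper bound on sequential gate delays that fit in one clock period.
propagation_delays_per_clock = CLOCK_NS // PROPAGATION_NS

print(clocks_per_minor_cycle,
      minor_cycles_per_major_cycle,
      propagation_delays_per_clock)
```

The ratio of ten minor cycles to one memory cycle is the same factor that allows ten time-multiplexed peripheral processors to each complete one memory access per major cycle.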

This  series  is  interesting  from  many  aspects.  It  has  remained 
the  fastest  operational  computer  for  many  years.  Its  large 




[PMS diagram; only the annotations survive. Components named: several C's (Mp-Pc and Mp-Pio); S('Central Exchange; 5 µs/w; t.access: 65 µs); K('Peripheral Buffer; drum); Ms(drum; 16 µs/w; 8192 w); Ms(magnetic tape; 150 in/s; 200 char/in); T(cards, paper tape); T('Master Console); T(#1:32; full duplex; Flexowriter, Teletype, analog, plot; 30 char/s); K('Display Buffer; drum; 8192 w); T(#1:8; console; CRT display, light pen, keyboard, joy stick).]

Pc(2 address/instruction; technology: transistor; descendants: RW-400, AN/FSQ-27)
Mp(core; 10 µs/w; 1024 w; (26, 2 parity) b/w)


Fig. 1. RW-40 (Polymorphic) PMS diagram.


component count almost implies it cannot exist as an opera-
tional entity. Thus it is a tribute to an organization, and the
project  leader-designer  Seymour  Cray,  that  a  large  number 
exist.  There  are  sufficiently  high  data  bandwidths  within  the 
system  so  that  it  remains  balanced  for  most  job  mixes  (an 
uncommon  feature  in  large  C's).  It  has  high  performance 
Ms.disks and T.displays to avoid bottlenecks. The Pc's ISP is
a  nice  variation  of  the  general-registers  processor  and  allows 
for  very  efficient  encoding  of  programs.  The  Pc  is  nicely  multi- 
programmed  and  can  be  switched  from  job  to  job  more  quickly 
than  any  other  computer.  Ten  smaller  C's  control  the  main 
Pc  and  allow  it  to  spend  time  on  useful  (billable)  work  rather 
than  its  own  administration.  The  independent  multiple  data 
operators in the 6600 increase the speed by at least 2½ times
over  a  6400  which  has  a  shared  D.  Finally,  it  realizes  the  10  C's 
in  a  unique,  interesting,  and  efficient  manner.  Not  many  com- 
puter systems  can  claim  half  as  many  innovations. 


PMS  structure 

A  simplified  PMS  structure  of  the  C('6400,  '6600)  is  given  in 
Fig. 2. Here we see the C(io; #1:10) each of which can access
the  central  computer  (Cc)  primary  memory  (Mp).  Figure  2  shows 


[PMS diagram. Components named: Cc, comprising Mp(60 b/w) and Pc(#1); C(io; #1:10), each with Mp(12 b/w); periphery.]


Fig.  2.  CDC  6600  PMS  diagram  (simplified). 




why  we  consider  the  6600  to  be  fundamentally  a  network.  Each 
Cio  (actually  a  general-purpose,  12-bit  C)  can  easily  serve  the 
specialized  Pio  function  for  Cc.  The  Mp  of  Cc  is  an  Ms  for  a  Cio, 
of  course.  By  having  a  powerful  Cio,  more  complex  input-output 
tasks  can  be  handled  without  Cc  intervention.  These  tasks  can 
include  data-type  conversion,  error  recovery,  etc.  The  K's  which 
are  connected  to  a  Cio  can  also  be  less  complex.  Figure  2  has 
about the same information as Thornton's Fig. 1 block diagram
(Chap.  39). 

A  detailed  PMS  diagram  for  the  C('6400,  '6416,  '6500,  and 
'6600)  is  given  in  Fig.  3.  The  interesting  structural  aspects  can 
be  seen  from  this  diagram.  The  four  configurations,  6400  — 
6600,  are  included  just  by  considering  the  pertinent  parts  of 
the  structure.  That  is,  a  6416  has  no  large  Pc;  a  6400  has  a  sin- 
gle straightforward  Pc;  a  6500  has  two  Pc's;  and  the  6600  has 
a  single  powerful  Pc.  The  6600  Pc  has  10  D's,  so  that  several 
parts of a single instruction stream can be interpreted in paral-
lel. A 6600 Pc also has considerable M.buffer to hold instruc-
tions so that Pc need not wait for Mp fetches.

The  implementation  of  the  10  Cio's  can  be  seen  from  the 
PMS  diagram  (Fig.  3).  Here,  only  one  physical  processor  is  used 
on a time-shared basis. Each 0.1 µs a new logical P is processed
by the physical P. The 10 Mp's are phased so that a new access
occurs each 0.1 µs. The 10 Mp's are always busy. Thus the i.rate
is 10 × 12 b/µs or 120 megabits/s. This process of shifting
a new Pc state into position each 0.1 µs has been likened to
a  barrel  by  CDC.  A  diagram  of  the  process  is  shown  in  Fig.  4. 
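
The barrel arrangement can be sketched as a round-robin schedule. The code below is only an illustration built from the figures in the preceding paragraph (ten logical processors, one slot every 0.1 µs, 12-bit words); the function and constant names are introduced here and do not come from CDC documentation.

```python
SLOT_NS = 100        # a new logical processor is served every 0.1 microsecond
N_LOGICAL = 10       # ten logical peripheral processors share one physical P
WORD_BITS = 12       # width of each peripheral processor's Mp word


def barrel_schedule(n_slots):
    """Return (start_time_ns, logical_processor) for successive slots."""
    return [(slot * SLOT_NS, slot % N_LOGICAL) for slot in range(n_slots)]


def revisit_interval_ns():
    """Time between two turns of the same logical processor."""
    return N_LOGICAL * SLOT_NS


def aggregate_rate_mbit_per_s():
    """One 12-bit word is accessed per slot across the ten phased Mp's."""
    return WORD_BITS * 1000 / SLOT_NS  # bits per microsecond = megabits/s


for start_ns, processor in barrel_schedule(12):
    print(f"t = {start_ns:4d} ns  ->  logical P #{processor}")
```

Each logical processor returns to the physical P once per microsecond, which matches the 1.0 µs cycle of its own Mp; this is why the ten memories can be kept continuously busy.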

The  T's,  K's,  and  M's  are  not  given,  although  it  should  be 
mentioned  that  the  following  units  are  rather  unique:  a  K  for 
the  management  of  64  telegraph  lines  to  be  connected  to  a 
Cio;  an  Ms(disk)  with  four  simultaneous  access  ports,  each  at 
1.68  megachar/s  data  transfer  rate,  and  a  capacity  of  168 
megachar;  an  Ms(magnetic  tape)  with  a  K(#  1:4)  and  S  to  allow 
simultaneous  transfers  to  4  Ms;  the  T  (display)  for  monitoring 
the  system's  operation;  K's  to  other  C's  and  Ms's;  and  con- 
ventional T(card  reader,  punch,  line  printer,  etc.). 

ISP 

The  ISP  description  of  the  Pc  is  given  in  Appendix  1,  Chap.  39. 
The  Pc  has  a  very  clean,  straightforward  scientific-calculation- 
oriented  ISP.  We  can  consider  it  a  variation  on  the  general- 
register  structure  because  the  Pc  state  has  three  sets  of  general 
registers.  Their  use  is  explained  both  in  Chap.  39  and  its  Ap- 
pendix 1.  This  structure  assumes  that  a  program  consists  of 
several  read  accesses  to  a  large  array(s),  a  large  number  of 
operations  on  these  accessed  elements,  followed  by  occasional 


write  accesses  to  store  results.  We  would  agree  that  this  is  a 
valid  assumption  for  scientific  programs  (e.g.,  look  at  a  FOR- 
TRAN arithmetic  statement),  and  it  is  probably  valid  for  most 
other  programs  as  well. 

Cc  has  provisions  for  multiprogramming  in  the  form  of  a 
protection  and  relocation  address.  The  mapping  is  given  in  the 
ISP description for both Mp and Ms('Extended Core Storage/ECS).
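
A relocation and protection mapping of this general kind can be sketched in a few lines. The version below is a generic base-and-limit scheme written for illustration; the names `relocation`, `limit`, and `ProtectionFault` are assumptions introduced here, and the exact mapping used by the machine is the one given in the ISP description of Chap. 39, not this sketch.

```python
class ProtectionFault(Exception):
    """Raised when a program addresses a word outside its own region."""


def map_address(relative: int, relocation: int, limit: int) -> int:
    """Translate a program-relative address into an absolute Mp address.

    relocation: absolute address at which the job's region starts
    limit:      number of words the job is allowed to address
    (both are hypothetical parameter names for this illustration)
    """
    if not 0 <= relative < limit:
        raise ProtectionFault(f"address {relative} outside 0..{limit - 1}")
    return relocation + relative


# A hypothetical job loaded at absolute address 1000 with 64 words.
print(map_address(5, relocation=1000, limit=64))
```

Because every job sees addresses starting at zero, a job can be moved within Mp by changing only the relocation value, which is what makes this form of multiprogramming inexpensive.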

Appendix  2,  Chap.  39,  has  an  ISP  description  of  the  PCP. 
Appendix  2  includes  a  figure  which  shows  the  instruction  de- 
coding and  execution  as  well.  The  6600  PCP  is  about  the  same 
as  the  early  CDC  160.  The  PCP  has  an  18-bit  A  register  because 
it  has  to  process  addresses  for  the  large  Cc. 

One  interesting  aspect  of  the  6600  which  we  question  is  the 
lack  of  communication  among  all  components  at  the  ISP  (pro- 
gramming) level.  When  Pc  stops,  it  has  no  way  of  explicitly 
informing  any  other  components.  There  are  no  interprocessor 
interrupts.  An  io  device  cannot  interrupt  a  Pio,  nor  can  Pio's 
communicate  with  one  another  except  by  polling.  The  state 
switching  for  Pc  is,  however,  elegant,  since  a  Pio  can  request 
Pc  to  stop  a  job,  store  Mps,  and  resume  a  new  task  in  one 
instruction. (The t.save + t.restore ≈ 2 µs.)

The  operating  system 

The  Cio's  functions  are  data  transmission  between  a  peripheral 
device  and  the  large  Cc  via  the  Cio's  Mp  with  some  data  trans- 
formation or  conversions;  complete  task  management,  includ- 
ing initiation,  termination,  and  error  handling;  and  manage- 
ment of  Pc.  The  Cio's  perform  in  about  the  same  manner  as 
the C('Attached Support Processor) in the N('360 ASP) (Chap.
40,  page  506).  The  operating-system  software  is  managed  by 
a  single  fixed  Cio.  The  remaining  nine  Cio's  are  free,  and  as 
io  tasks  arise  in  the  system,  the  Cio's  assign  themselves  to 
particular  tasks,  carry  out  the  tasks,  and  then  free  themselves 
to  take  on  other  tasks.  The  operating-system  software  resides 
in  Mp(Pc)  (that  is,  Cc)  accessible  to  all  Cio's  and  includes: 

1  The  variables  which  determine  the  state  of  a  particular 
job,  e.g.,  data  pointers  to  Ms(disk,  'ECS),  running  time, 
a  list  of  jobs  to  do,  etc. 

2  Programs  for  the  Cio's 

a  Parts  of  the  operating  system  used  by  the  Cio  re- 
sponsible for  the  system  management 

b  Io management programs (or programs to get the
task  management  program  from  Ms)  which  the  Cio's 
use 
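
The self-assignment of free Cio's to tasks can be pictured as a pool of workers drawing on a shared task list. The sketch below is a simplified model introduced here for illustration: the task names, the function name, and the rule that tasks finish in the order they were claimed are all assumptions, not a description of the actual operating system.

```python
from collections import deque


def run_pool(tasks, n_free=9):
    """Model free Cio's claiming tasks from a shared list held in central Mp.

    Cio 0 is taken to be the fixed system-management processor, so the
    free processors are numbered 1..n_free. Returns (cio, task) pairs in
    order of completion.
    """
    if n_free < 1:
        raise ValueError("at least one free Cio is required")
    pending = deque(tasks)                 # shared list of io tasks
    free = deque(range(1, n_free + 1))     # Cio's not currently assigned
    busy = deque()                         # (cio, task) pairs in progress
    completed = []
    while pending or busy:
        # Every free Cio assigns itself to the next pending task.
        while pending and free:
            busy.append((free.popleft(), pending.popleft()))
        # Simplification: the oldest task in progress finishes first.
        cio, task = busy.popleft()
        completed.append((cio, task))
        free.append(cio)                   # the Cio frees itself
    return completed


print(run_pool(["disk", "tape", "print"], n_free=2))
```

No central dispatcher hands out work in this model; the only shared state is the task list, which mirrors the way the system tables reside in central Mp where all Cio's can reach them.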




[PMS diagram; only the annotations survive. Components named: M('Barrel; working; 10 w; 51 b/w; 0.1 µs/w); Mp(#0:9); Pc(#0:9; 'Peripheral and Control Processor/PCP); Mp(#0:31); C('Central); 'Read Pyramid (buffer; 12 b/w; M(working; (1+2+3+4+5) w; 12 b/w); .2 µs/w); 'Write Pyramid (buffer; 12 b/w; M(working; (5+4+3+2+1) w; 12 b/w); .2 µs/w); K('Extended Core Coupler; 60 b/w); T('Dead Start Console); L(1 µs/w; 12 b/w); T(#1:2; CRT; display); T(keyboard); Ms(#0:15).]

Mp(core; 1.0 µs/w; 4096 w; 12 b/w)
S(time multiplex; .1 µs/w; 12 b/w)
Pc('Peripheral and Control Processor; #0:9; time multiplex; .1 µs/w; 1 address/instruction; 12 b/w; Mps('Program Counter, Accumulator); 1,2 w/instruction)
Mp(core; 1.0 µs/w; 4096 w; (5 × 12) b/w)
S(time multiplex; 0.1 µs/w; 60 b/w)
Ms('Extended Core Storage/ECS; 3.2 µs/w; (125952 / R) w; (R × (60, 1 parity)) b/w)

See Chapter 39 for operation.
Only present in CDC 6500.
No C('Central) in CDC 6416; CDC 6500 and CDC 6400 do not have K('Scoreboard), separate D's, and M('Instruction Stack).

Pc('6600; 15, 30 b/instruction; technology: transistor; ~1964; data: si, bv, w, sf, df) :=
Mps(flip flop; ~16 w); S('Switchboard); K(interpreter); K('Scoreboard); M.working; M.instruction('Instruction Stack; content addressable; flip flop; 8 w; 60 b/w); D('Shift); D('Boolean); D(#1:2; 'Increment); D('Branch); D('Add; 0.3 µs); D('Long Add); D(#1:2; 'Multiply; 1 µs); D('Divide; 2.9 µs)


Fig.  3.  CDC  6400,  6416,  6500,  and  6600  PMS  diagram. 




Fig.  4.  CDC  6600  peripheral  and  control  processors.  (Courtesy  of  Control  Data  Corporation.) 




In a typical system, one might expect to find the following
assignment of PCP's:

 1  Operating-system execution, including scheduling and
    management of Cc and all Cio's
 2  Display of job status data on T(display)
 3  Ms(disk) transfer management
 4  T(printers, card reader, card punch)
 5  L(#1:3; to: C, satellite)
 6  Ms(magnetic tape)
 7  T(64 Teletypes)
 8  Free to be used with Ms(disk) and Ms(magnetic tape)
 9  Free
10  Free

CDC  7600 

The  CDC  7600  system  is  an  upward  compatible  member  of  the 
CDC  6000  series.  Although  the  main  Pc  in  the  7600  is  compati- 
ble with  the  main  Pc  of  the  6600,  instructions  have  been  added 
for  controlling  the  io  section  and  for  communicating  between 
Large Core Memories/LCM and Small Core Memory/SCM. It is
expected  to  compute  at  an  average  rate  of  four  to  six  times 
a  C('6600). 

The  PMS  structure  (Fig.  5)  is  substantially  different  from  that 
of  the  6600.  The  C('7600  Peripheral  Processing  Unit/PPU), 
unlike  the  C('6600  Peripheral  and  Control  Processor)'s,  has  a 
loose  coupling  with  the  main  C.  The  PPU's  are  under  control 
of the main C when transferring words into SCM via K('Input-
Output Section). The 15 C('PPU)'s have 8 input/output chan-
nels. These channels, which can run concurrently, provide the
link  between  C('PPU)  and  peripheral  Ms's  and  T's.  Some  of  the 
PPU's  are  located  in  the  same  physical  space  as  the  Pc. 


[PMS diagram; only the annotations survive. Components named: Ms(#0:7); Mp(#0:31); Pc; core-to-core transfer path; K('Input Output Section; M(buffer; 15 w; 60 b/w); 55 ns/w; 60 b/w); S(time multiplex); C(#1:15; 'PPU); Basic N('CDC 7600).]

Ms('Large Core Memory/LCM; 1.760 µs/w)
Mp('Small Core Memory/SCM; .275 µs/w; 2 kw; 60 b/w)
S(time multiplexed; 27.5 ns/w; 60 b/w)
C('Peripheral Processing Unit/PPU) := Mp(#0:1; 275 ns/w; 2048 w; 12 b/w); S; Pc(1 address/instruction; 1,2 w/instruction); Kio(#0:7; 'IO Channel); L(to: K); K; T | Ms | C('Central)

Pc := Mps(flip flop; 27.5 ns/w; ~16 w; 60 b/w); M.working; instruction interpreter; 'Instruction Stack (flip flop; 27.5 ns/w; 12 w; 60 b/w); D('Long Add); D('Increment); D('Population Count); D('Boolean); D('Shift); D('Normalize); D('Floating Add); D('Floating Multiply); D('Floating Divide)


Fig.  5.  CDC  7600  computer  PMS  diagram. 




The  7600  Pc  can  be  interrupted  by  a  clock,  the  PPU's,  and 
trap  condition  within  the  Pc.  A  breakpoint  address,  BPA,  can 
be  set  up  within  Pc  such  that,  on  the  program  reaching  BPA, 
a  trap  is  initiated.  This  interruption  scheme  is  in  contrast  to 
that  of  the  6600,  which  could  not  be  interrupted  or  trapped. 
The  7600  interrupt  may  be  a  reaction  to  the  lack  of  intercom- 
munication in  the  6600. 

Conclusions 

Although  the  6600  was  somewhat  behind  its  announced  delivery 
schedule  and  represented  a  significant  drain  on  the  financial 
resources  of  CDC,  it  is  now  clear  that  it  is  a  successful  product. 


There  have  been  instances  of  very  large  computers  not  being 
carried  to  completion  either  for  financial  or  technical  reasons. 
The  6600  seems  to  be  the  first  large  computer  to  achieve  these 
marks  of  success.  Here  we  are  interested  in  the  6600  because 
it  has  held  the  "world's  largest  computer"  title  for  so  long. 

Computer-network  examples 

In Chap. 40, we present examples of seven computer networks.
There  is  a  dearth  of  both  computer  networks  and  of  papers  on 
computer  networks. 

This  chapter  takes  examples  from  papers  and  from  knowl- 
edge of  several  existing  or  proposed  networks. 


Chapter  38 


The  RW-400— a  new  polymorphic 
data  system  1 

R. E. Porter

Summary The RW-400 Data System, based upon modularly constructed,
independently operating and flexibly connected components, is the logically
evolved successor to conventional computer designs. It provides the means
by which information processing requirements can be met with equipment
capable of producing timely results at a cost commensurate with problem
economic value. System obsolescence is minimized by the expandability in
numbers and types of processing modules. Real time reliability is assured
by component duplication at minimum cost and by the advanced design
techniques employed in the system's manufacture. Man-machine communi-
cation facilities are program controlled for maximum flexibility. Parallel
processing and parallel information handling modules increase the system's
speed and adaptability when handling complex computing workloads. This
polymorphic design truly represents an extension of man's intellect through
electronics.

The  RW-400  Data  System  is  a  new  design  concept.  It  was  devel- 
oped to  meet  the  increasing  demand  for  information  processing 
equipment with adaptability, real-time reliability and power to
cope with continuously-changing information handling require-
ments. It is a polymorphic system including a variety of function-
ally-independent modules. These are interconnectable through a
program-controlled electronic switching center. Many pairs of
modules may be independently connected, disconnected, and re-
connected, in microseconds if need be, to meet continuously-
varying  processing  requirements.  The  system  can  assume  whatever 
configuration  is  needed  to  handle  problems  of  the  moment.  Hence 
it  is  best  characterized  by  the  term  "polymorphic  " — having  many 
shapes. 

Rapid,  program-controlled  switching  of  many  pairs  of  func- 
tionally-independent modules  permits  nondisruptive  system  ex- 
pandability, operating  reliability,  simultaneous  multi-problem 
processing  capability,  and  man-machine  intercommunication 
feasibility. These are only partially found in computers of conven-
tional design.

Computer  users  have  been  forced  heretofore  to  match  problems 
to  computer  limitations.  Problem  changes  posed  serious  reorien- 
tation and  reprogramming  difficulties.  Changes  from  one  computer 

¹Datamation, vol. 6, no. 1, pp. 8-14, January-February, 1960.


to  another  model,  due  to  growth  in  applications,  often  resulted 
in  large  expenditures  of  time  and  money.  During  maintenance  or 
malfunction of a conventional computer its entire processing
capacity  is  shut  down.  Real  time  processing  reliability  cannot  be 
maintained  on  an  around-the-clock  basis.  The  conventional  ma- 
chine must  process  its  problems  serially.  This  serious  limitation 
is only partially alleviated by time-sharing or computing-ele-
ment-doubling designs. The high cost-per-hour of conventional
computer operation rules out direct man-machine intercommuni-
cation during other than emergency situations.

The radically-new polymorphic design concept of the RW-400
Data System was evolved by Ramo-Wooldridge engineers to pro-
vide a practical solution to those information processing problems
now inadequately handled by conventional computer designs. The
RW-400 is a powerful new tool in the field of intellectronics: the
extension of man's intellect by electronics.

System  description 

The RW-400 Data System contains an optional number and variety
of functionally-independent modules. These communicate via a
central electronic switching exchange. Each module is designed,
within practical economic and functional limits, to maximize
system adaptability over a wide range of problem types and sizes.
This new design embodies the latest proven electronic design
techniques, assuring high processing speeds and high equipment
reliability. The RW-400's modularity assures reliable, round-the-
clock processing of information with controllable computing ca-
pacity degradation during module maintenance or malfunction.
Practical man-machine intercommunication is achieved in the
RW-400 system by use of program-controlled information display
and interrogation consoles.

Figure 1 shows the over-all system design. Modules of various
types communicate through a central exchange switching center.
Computing and buffering modules provide control for the system.
These modules are self-controlled and make possible completely
independent processing of two or more problems. One of the
computer modules may be designated the master computer and



[Block diagram. Labels: controlling, interrogation, switching center, auxiliary storage, input/output.]


Fig.  1.  The  RW-400  data  system. 


in  this  role  initiates  and  monitors  actions  of  the  entire  system.  An 
alert-interrupt  network  is  provided  to  allow  coordinated  system 
action.  Therefore,  the  system  as  applied  to  given  information 
processing  problems  may  change  on  a  short  range  (microsecond) 
basis,  thus  providing,  through  programming,  a  self-organizing 
aspect  to  the  system.  In  addition,  the  system  may  change  through 
the  years  as  the  applications  change.  The  most  efficient  and  eco- 
nomical complement  of  equipment  is  applied  to  the  problem  at 
all  times. 

An RW-400 system is built around an expandable Central
Exchange (CX) to which a number of primary modules may be
attached. These are: Computer Modules (CM); self-instructed
Buffer  Modules  (BM);  Magnetic  Tape  Modules  (TM);  Magnetic 
Drum  Modules  (DM);  Peripheral  Buffer  Modules  (PB);  and 
console  communication  Display  Buffer  Modules  (DB).  How  many 
modules  are  put  together  in  a  system  is  entirely  a  function  of 
system  application.  In  addition  to  primary  system  modules, 
punched  card,  punched  tape,  high  speed  printing  and  control 
console devices are available. These handle nominal system in-
put/output requirements. Additional man-machine communica-
tion devices such as interrogation, display and control consoles,
may  be  included  in  the  system  as  problem  requirements  dictate. 
A  Tape  Adapter  (TA)  module  is  available  to  provide  compatibility 
with  magnetic  tape  of  other  computers.  Information  generated  at 
Flexowriter  inquiry  and  recording  stations  may  be  directly  re- 
ceived by  the  system  via  the  Peripheral  Buffer  Module.  This  latter 
module  also  buffers  the  receipt  of  TWX  and  punched  tape  infor- 
mation. 

The way in which a particular RW-400 Data System functions
depends  on  the  number  and  type  of  each  module  included.  It  may 
initially  be  composed  of  the  minimum  number  and  variety  of 
modules  needed  to  do  a  small  problem  or  the  initial  part  of  some 
large  but  yet-to-be-defined  problem.  Such  a  system  would  work 
much  like  a  conventional  computer.  It  would  probably  include 
a  buffer  module  and  thus  have  a  parallel  data  handling  capability 
not  found  in  the  conventional  design  at  a  comparable  price.  The 
initial  system  installation  may  then  be  augmented  by  the  timely 
addition  of  modules. 




A buffer module (BM) has the capability to control its acquisi-
tion and dissemination of information independently. The buffer
provides  a  computer  module  with  parallel  data  handling  capability 
without  complicating  the  problem  processing  program  with  the 
conventional  intermixture  of  arithmetic  and  housekeeping  in- 
structions. Information  previously  generated  by  the  processing 
program  may  be  appropriately  disposed  of  within  the  system  while 
processing continues. Data needed at a subsequent time in the
processing may be retrieved from system storage in advance of
need  while  processing  progresses.  The  simultaneity  of  these  oper- 
ations not  only  materially  increases  over-all  processing  speed  but 
also  increases  the  practical  utility  of  the  less  costly  types  of  in- 
ternal system  storage  such  as  a  magnetic  tape. 

The computer (CM) or buffer (BM) modules, when acting in
a controlling capacity, may initiate connection to an information
storage or handling module during that part of the processing
program when the two can work profitably in unison. The pair
of modules thus interconnected neither affect nor are affected by
other modules. Logical interlocks prevent unwanted cross talk
among modules. An intermodule communication system lets
controlling modules signal status or alert other such modules of their
need to communicate. The decision by a module receiving an alert
signal to permit interruption or to proceed is optional with
that module. The optional interrupt feature is that needed to
make the often-discussed but seldom-used program interrupt
capability both useful and practical. Programs may thus permit
interruptions only at convenient points in the processing
sequence.

Modules may be assigned, under program control, to work
together on a problem in proportion to its needs. As soon as a
module's function is complete for a given problem, that module
may be released for reassignment to some other task. The system
is thus self-controlled to match processing capacity to each
problem for the time necessary to do the job. Full system capacity may
be brought to bear upon a very large problem when needed. This
capacity may be apportioned among a number of smaller problems
for simultaneous processing, program compilation, program
checkout, module maintenance, etc., when it is not needed for
maximum system effort.

From  the  preceding  system  description,  it  is  apparent  that  such 
equipment  can  be  expanded  from  a  modest  initial  installation  into 
a  very  powerful  and  comprehensive  information  processing  cen- 
ter as  requirements  warrant.  More  specific  descriptions  of  prin- 
cipal system  modules  follow  to  give  the  reader  a  better  feel 
for  how  this  system  might  perform  his  information  processing 
work. 


The  functional  modules 

The key to appreciative understanding of the power of the RW-400
lies  in  knowledge  of  intermodule  connection.  It  is  appropriate  to 
describe  the  Central  Exchange  (CX)  unit  first,  then  follow  with 
descriptions  of  the  various  modules. 

The  central  exchange 

The Central Exchange performs the vital function of interconnecting
a pair of modules whenever requested to do so by either
a computer or a buffer module. Since internal programmed control
is only possible within a computer or a buffer module, one of the
interconnected pair of modules must be either a computer or a
buffer. The time in which any connection may be made or broken
is about 65 microseconds. An exchange has basic capacity to
connect any of 16 computer or buffer modules to any of 64 auxiliary
function modules. There is nothing sacred about the number
16 since it is possible to extend the CX module's interconnection
matrix through design modification when need arises. The CX is
an expandable, program-controlled, electronic switching center
capable of connecting or disconnecting any available pair of
modules in roughly the time of one computer instruction
execution. Figure 2 illustrates the permissible module interconnections
within the Central Exchange.

Every intersection on the illustration represents a possible
connection between modules. The "x-ed" intersections indicate
typical connections in force at any point in time. The control logic
of the CX module's connection table prevents more than one
interconnection on any horizontal (controlling) or vertical
(controlled) data path representation on the diagram. When
connection is requested of the Central Exchange while one of the
required modules is already carrying out a previous assignment, the
requesting module can be programmed to sense this condition and
wait until connection can be made without interference. Should
waiting be undesirable, the requesting module can go on about
its business and check back later to see when the desired
connection can be made. There is an implication here, of course, that
knowing the kind of a system he is dealing with, a programmer
requests connections in advance of need whenever possible.
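
The connection rule just described (at most one connection on any controlling row or controlled column, with a busy request refused so the requester can wait or try again later) can be sketched in a few lines of Python. This is an illustrative model only: the class name, method names and module numbering are assumptions and are not taken from the RW-400 documentation.

```python
class CentralExchange:
    """Toy model of the CX connection table (all names are hypothetical)."""

    def __init__(self, n_controlling=16, n_controlled=64):
        self.n_controlling = n_controlling
        self.n_controlled = n_controlled
        self.row_to_col = {}   # controlling module -> controlled module
        self.col_to_row = {}   # controlled module -> controlling module

    def request(self, controlling, controlled):
        """Try to connect; return True on success, False if either side is busy."""
        if not 0 <= controlling < self.n_controlling:
            raise ValueError("no such controlling module")
        if not 0 <= controlled < self.n_controlled:
            raise ValueError("no such controlled module")
        if controlling in self.row_to_col or controlled in self.col_to_row:
            return False       # requester may wait, or check back later
        self.row_to_col[controlling] = controlled
        self.col_to_row[controlled] = controlling
        return True

    def release(self, controlling):
        """Break the connection held by a controlling module, if any."""
        controlled = self.row_to_col.pop(controlling, None)
        if controlled is not None:
            del self.col_to_row[controlled]


cx = CentralExchange()
first = cx.request(0, 5)    # controlling module 0 takes auxiliary module 5
second = cx.request(1, 5)   # refused while module 5 is still in use
cx.release(0)
third = cx.request(1, 5)    # accepted once module 5 has been released
```

The sketch leaves out the assignment matrix and master status, which would act as an additional filter in front of `request`.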

Provision for master-slave control is included via an Assignment
Matrix established within the CX module by a computer module
previously assigned to master status. Such a provision is necessary
to prevent inadvertent connection requests, whether from unchecked
programs or malfunctioning control modules, from affecting sets
of modules simultaneously processing another problem. Connection
requests are therefore essentially filtered through both an
assignment and an interconnection validity matrix prior to being acted
upon by the Central Exchange. The computer module manually
assigned to master status is the only one permitted to cause the
interconnection of a pair of modules which does not include itself.

Part 5 | The PMS level

Section 4 | Network computers and computer networks

The  computer  module  (See  Fig.  3) 

The Computer Module (CM) is a self-sufficient, general purpose,
two-address, parallel word, fixed point, random access computer.
Its internal magnetic core memory has a capacity of 1024 words.
A computer word consists of 26 information bits and 2 parity bits.
Each parity bit is associated with the 13-bit half word transferred
in parallel via the Central Exchange to other system modules. The
instruction repertoire of the CM consists of 38 primary instructions
whose various modes effectively result in over 300 different
operations. Of the 39 available CM-400 instructions, 24 may be
classified as "arithmetic" and 10 as "program control" or "sequence
determining" instructions. Five additional instructions may be
classified as "external" or "input/output" instructions. All but
three of the 24 arithmetic instructions fit into a symmetric scheme
of classification wherein there are seven basic operations, each
having three distinct modes. The seven basic operations are add,
subtract, absolute subtract, multiply, divide, square root and insert.
The three modes are Replace, Hold and Store. If we let the
capital letter "G" identify the first operand, "H" identify the
second operand, "∘" signify an arbitrary operation, the
symbol "→" indicate replace, and "A" the word in the accumulator,
then the three modes may be characterized as:

Replace:   H ∘ G → H, A
Hold:      H ∘ G → A
Store:     A ∘ G → H, A
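
As a rough illustration of the three modes, the following Python sketch applies an arbitrary two-operand operation in Replace, Hold or Store fashion. It is a simplified model under stated assumptions: memory is a plain dictionary, word length, overflow and one's complement arithmetic are ignored, and the function name `execute` is hypothetical.

```python
import operator


def execute(mode, op, mem, acc, g_addr, h_addr):
    """Return the new accumulator; update mem[h_addr] where the mode says so."""
    g, h = mem[g_addr], mem[h_addr]
    if mode == "replace":      # H op G -> H, A
        acc = op(h, g)
        mem[h_addr] = acc
    elif mode == "hold":       # H op G -> A
        acc = op(h, g)
    elif mode == "store":      # A op G -> H, A
        acc = op(acc, g)
        mem[h_addr] = acc
    else:
        raise ValueError(mode)
    return acc


mem = {10: 3, 11: 20}          # G lives at address 10, H at address 11
acc = execute("hold", operator.sub, mem, 0, 10, 11)       # A = 20 - 3, H untouched
held_h = mem[11]
acc = execute("store", operator.add, mem, acc, 10, 11)    # A = 17 + 3, copied to H
acc = execute("replace", operator.mul, mem, acc, 10, 11)  # A = H = 20 * 3
```

Seven operations in three modes give the 21 instructions of the symmetric scheme; the remaining three arithmetic instructions fall outside it.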

The three remaining arithmetic operations are Add Accumulate,
wherein the contents of H and G are added to the Accumulator;
Multiply Accumulate, wherein the contents of H are multiplied
by G and added to A; and Transmit, where the contents of G are
stored in H.

The ten program control instructions are Store, Store Double
Length Accumulator, Load Accumulator, Insert Mask in the
S Register, Stop, Link Jump, Compare Jump, Tally Jump, Test
Jump and a Multi-purpose Shift.

The  five  external  instructions  are  those  which  cause  data  to 
be  transmitted  to  or  received  from  a  device  external  to  the  com- 
puter. Each  command  is  multi-purpose  in  nature  and  hence  equiv- 
alent to  several  conventional  external  instructions.  The  commands 
are — Command  Output,  Data  Input,  Conditional  Data  Input,  Data 
Output  and  Character  Transfer.  A  comprehensive  discussion  of  the 
variation of each of these commands is not pertinent to this article.
Suffice it to say that commands are available for carrying out a
wide  variety  of  intermodule  data  communication. 

[Fig. 3 is a block diagram. Labeled blocks: Control Logic; Next
Address Counter; Instruction Register (Op, Address, Address);
Central Exchange; Interrupt Logic; Input Lines; Output Lines;
Interrupt Sensing Register; Exchange Register; Accumulator;
Accumulator Extension; Magnetic Core Storage; Control Panel;
Alert Conditions.]

Fig. 3. The CM-400 Computer Module.

RW-400 analysis console.

The interrupt capability of a Computer Module is a logical
generalization of the "trapping" feature found on several
conventional computers. It permits the automatic interruption of a
program, at the option of the program, when the computer module
receives an "alert" that a condition requiring attention has arisen.
It can be used to warn the program when an error of some type
has occurred, minimize unproductive computer waiting time while
another module completes its task, eliminate many programmed
status test instructions and provide a convenient means of
subjecting one computer module to the control of another. Program
control of interruptions within a CM-400 is accomplished through
the sense register S. This register may be filled with an interrupt
mask by means of the Insert S instruction. A bit by bit
correspondence exists between the S register and the interrupt
register I to which the alert lines are connected. A
Test Jump instruction can be used to examine the coincidence
between these registers of an alert signal in a bit position
corresponding to a one in the S register mask. If an alert is received
by the computer during the execution of an instruction, control
will be transferred to memory location "0" at the end of the
instruction if, and only if, (a) the sense bit corresponding to the
alert is a "one," (b) the master sense bit is a "one," and (c) the
instruction was not an "Insert S." The master sense bit in the S
register may be programmed to permit the interrupt to take place
according to the interrupt mask or to inhibit interrupt until the
program can conveniently cope with it. All instructions being
executed at the time an interrupt condition occurs are completed
before the interruption is allowed to take place.
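
The three conditions (a), (b) and (c) can be written out as a small Python predicate. This is a sketch under assumptions: the position of the master sense bit within S is not stated in the text, so it is passed here as a separate flag, and the function names and the use of instruction names as strings are purely illustrative.

```python
def interrupt_taken(alerts, sense_mask, master_sense, instruction):
    """True when an alert should transfer control at the end of the instruction.

    alerts and sense_mask are integers whose bits correspond one for one,
    mirroring the interrupt register I and the mask held in S.
    """
    pending = alerts & sense_mask              # (a) alerts whose sense bit is one
    return bool(pending) and master_sense and instruction != "Insert S"


def next_location(normal_successor, alerts, sense_mask, master_sense, instruction):
    """Location executed next: 0 on interrupt, otherwise the normal successor."""
    if interrupt_taken(alerts, sense_mask, master_sense, instruction):
        return 0
    return normal_successor
```

A program that wants to defer interruptions clears the master flag and can later poll with a Test Jump, which corresponds to evaluating `alerts & sense_mask` without transferring control.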

Figure 3 schematically illustrates the Computer Module's primary
registers and the interconnecting information paths.

Typical two-address addition and subtraction times are
approximately 35 microseconds including memory access time.
Multiplication takes about 80 microseconds, and division and square
root about 130 and 170 microseconds respectively.

Before attempting to draw a comparison between a CM and
a deluxe conventional computer the reader should bear in mind
the trade offs in features versus cost; parallel processing versus
sequential processing; independent information handling versus
program complicating "housekeeping"; and real time system
reliability versus periodic inoperability. The only valid comparison
is that between the RW-400 Data System and a conventional
computer applied to the same task. The contribution to the
RW-400 system made by the Buffer Modules can be better assessed
by the reader after the following description has been considered.

The  buffer  module 

A  Buffer  Module  consists  of  two  independent  logical  buffer  units, 
each  having  1024  words  of  random  access  magnetic  core  storage 
and  a  number  of  internal  registers  used  in  performing  its  functions 
when  in  the  self-controlling  mode.  A  Buffer  Module  may  be  con- 
nected to  a  Computer  Module  so  that  the  Buffer's  core  storage  is 
accessible  to  the  computer  as  an  extension  of  the  computer's  own 
storage.  A  Buffer  may  also  serve  as  an  intermediary  device  between 
a  computer  and  another  module,  such  as  a  tape  or  drum,  to 
minimize  time  conventionally  lost  in  data  transfers.  The  Buffer 
is  capable  of  recognizing  and  executing  certain  instructions  stored 
in its own memory. It can therefore be left to perform data
handling functions on its own while computer modules are otherwise
occupied.

A  Buffer  Module  may  be  connected  to  a  Computer  Module 
and  the  buffer  1024  word  storage  used  as  an  indirectly  addressed 
extension  of  the  computer's  own  working  storage.  When  the  ad- 
dress 1023  (all  ones)  appears  in  the  operand  field  of  a  computer 
instruction  to  be  executed,  the  computer  is  signalled  that  the 
operand  refers  to  some  cell  in  buffer  storage.  The  computer  then 
uses  the  number  in  the  buffer  read  register  R  (or  in  the  case  of 
a few instructions, the buffer write register W) as the effective
address  designated  by  the  operand  field  of  the  instruction.  Ex- 
tended addressing  may  be  used  in  either  the  first  or  second  operand 
field  of  the  instruction  or  in  both  operand  fields.  If  extended 
addressing  is  used  in  only  one  operand  field,  the  effective  address 
designated  by  that  field  is  the  number  in  register  R.  A  "1"  is 
automatically  added  to  the  contents  of  the  R  register  after  the 
instruction  is  executed.  If  extended  addressing  is  used  in  both 
operand  fields  of  an  instruction,  the  effective  address  of  the  first 
operand  is  the  number  in  register  R  and  the  effective  address  of 
the  second  operand  is  one  more  than  the  number  in  register  R. 
A  "2"  is  automatically  added  to  the  contents  of  register  R  after 
the  execution  of  this  type  of  instruction.  The  R  (or  W)  register 
may  be  preset  to  any  desired  initial  condition  by  means  of  the 
computer's Command Output instruction. All the commands being
executed by the computer must be stored within the computer
module's storage and may not be in buffer cells addressed by the
computer at execution time. The extended addressing and buffer
register indexing may be used to materially simplify repetitive data
acquisition operations.
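
The extended addressing rule can be illustrated with a short Python sketch. It models only the read register R; the write register W, wrap-around at the end of buffer storage and all timing are ignored, and the names `BufferLink` and `fetch_operands` are hypothetical.

```python
EXTENDED = 1023  # all ones in a 10-bit operand field


class BufferLink:
    """A connected buffer unit: its storage plus the read register R."""

    def __init__(self, buffer_words, r=0):
        self.words = buffer_words
        self.r = r


def fetch_operands(g, h, cm_mem, link):
    """Return the (first, second) operand values and advance R as described."""
    used = 0

    def read(addr):
        nonlocal used
        if addr == EXTENDED:
            value = link.words[link.r + used]   # R, then R + 1 for a second field
            used += 1
            return value
        return cm_mem[addr]

    first = read(g)
    second = read(h)
    link.r += used          # +1 for one extended field, +2 for both
    return first, second


cm_mem = {5: 111}
link = BufferLink([10, 20, 30, 40], r=1)
one_field = fetch_operands(EXTENDED, 5, cm_mem, link)          # uses R = 1
both_fields = fetch_operands(EXTENDED, EXTENDED, cm_mem, link) # uses R = 2 and 3
```

Because R advances automatically, a loop over a block of buffer data needs no explicit address arithmetic in the computer program, which is the simplification the text refers to.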

The primary function of a Buffer Module is not, however, that
of an auxiliary computer storage unit. The drum and tape modules
more aptly serve this function in the RW-400 system. A Buffer
Module is capable of operating autonomously and of controlling
other modules such as Tape Modules, Drum Modules, Peripheral
Buffers, Display Buffers, Printers or Plotters. This capability
enables the Buffer Modules in a system to perform routine tape
searching and data transferral tasks thereby freeing the Computer
Modules to do more computing. In its "self-instruction" mode, the
buffer executes its own internally stored program in much the same
fashion as a computer. The memory of a Buffer Module will
therefore be occupied by its own control programs as well as blocks
of data which it is holding for transmission to other units. The
buffer is used to acquire information from the relatively slower
auxiliary storage and communication modules while the computer
proceeds at high speed. Blocks of information retrieved in advance
of computer need by the buffer may then be rapidly transferred
to the computer's own storage or operated upon as they stand in
the buffer via the indirect addressing capability of the computer.
Another feature of the buffer is its switching capability. Each
Buffer Module is composed of two buffer units tied together. A
unit function switching feature permits the employment of the
two units together in an alternating mode of operation. Continuous
information transfer from tape to computer, for example, may be
accomplished without stopping the tape unit. A switching
instruction executed simultaneously by both units of a Buffer Module
causes whatever devices were connected to the first unit to be
connected to the second and vice versa.
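
The alternating use of the two buffer units can be pictured with the following Python sketch. It is a sequential toy model, so it shows only the exchange of connections and not the overlap in time between tape and computer; the class, the device labels and the function `stream` are assumptions made for illustration.

```python
class BufferModule:
    """Two buffer units, each connected to one device at a time."""

    def __init__(self):
        self.connected = {0: "tape", 1: "computer"}   # unit -> device
        self.storage = {0: [], 1: []}                 # unit -> block held

    def switch(self):
        """Exchange the devices attached to the two units."""
        self.connected[0], self.connected[1] = self.connected[1], self.connected[0]

    def unit_for(self, device):
        return next(u for u, d in self.connected.items() if d == device)


def stream(blocks):
    """Pass blocks from tape to computer, alternating units after each block."""
    bm, delivered = BufferModule(), []
    for block in blocks:
        bm.storage[bm.unit_for("tape")] = block                 # tape fills one unit
        bm.switch()                                             # units trade devices
        delivered.append(bm.storage[bm.unit_for("computer")])   # computer drains it
    return delivered


result = stream([[1, 2], [3], [4, 5]])
```

In the real module the tape would already be filling the other unit while the computer drains the first, which is why the tape need not stop between blocks.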

Now that the functional controlling modules and the module
interconnection concept have been discussed, the more
conventional auxiliary storage modules available with the system may be
described to round out the processing capability of the system.

The  tape  modules 

A Tape Module consists of an altered Ampex FR-300 tape transport
plus the necessary power supplies and control circuitry to effect
information reading, writing and control. One inch mylar tape is
used. Information is written on 16 channels, two of which are
clock channels. The remaining 14 channels consist of 13
information bits plus parity. The information reading or recording rate
is 15,000 computer words per second. Data may be recorded on
tape in variable blocks up to a maximum of 1024 words per block
(the size of the storage available to hold the data in a sending
or receiving module). Each block is preceded by a block
identification which permits selective tape information searching by a
Buffer Module. Single blocks imbedded in a tape file of other
blocks can be overwritten. A two-stack head permits automatic
verification of each block as it is written. Readback parity errors
are automatically detected during the writing process. Thus
dropout areas may be determined while the data is still available in
a computer or buffer for recording elsewhere.

A description of the RW-400's tape handling capability would
not be complete without mentioning the Tape Adapter (TA)
module. This is a self-contained unit capable of performing the
reading and writing of magnetic tapes in a format acceptable to
the IBM 704 and 709 systems. The TA consists of an Ampex FR-300
half-inch digital tape transport, including dual gap head and servo
control system; reading, writing and control circuits; and a module
housing with its own blower and power supply.



RW-400  Buffer  Module. 




The  drum  module 

The  Drum  Module  (DM)  contains  a  magnetic  drum  with  storage 
capacity  of  8192  words.  It  may  be  connected  to  either  a  Computer 
or  a  Buffer  Module  through  the  Central  Exchange.  Average  access 
time to the first word position on the drum is 8 1/2 milliseconds.
Successive  words  are  transmitted  at  the  rate  of  60,000  computer 
words  per  second.  The  Drum  Module  is  conventionally  used  as 
an  intermediate  item  storage  device  to  minimize  tape  handling 
time. 

Special  system  communication  modules 

The  external  data  and  man-machine  communication  of  the 
RW-400 Data System are handled via drum buffer modules. A wide
variety  of  asynchronously  operated  equipment  is  speed  matched 
and  program  controlled  through  the  features  designed  into  these 
special  system  communication  modules. 

The  Peripheral  Buffer  (PB)  provides  input/output  buffers  for 
communication  between  Computer  or  Buffer  Modules  and  rela- 
tively slow  speed  external  devices  such  as  Flexowriters,  Plotters, 
Punched Tape Handlers, Teletype Lines and Keyboard Operated
Equipment.  The  Peripheral  Buffer  stores  its  information  in  four 
pairs  of  bands  which  operate  alternately  as  circulating  registers. 
Each  band  contains  eight  input  and  eight  output  buffers  for  a  total 
of  32  input  buffers  and  32  output  buffers  in  each  Peripheral  Buffer 
Module.  Each  buffer  is  a  drum  band  sector  64  computer  words 
long.  Conventionally  one  input  and  one  output  buffer  sector  are 
connected to each external device (such as a Flexowriter) to permit
two-way  communication  between  the  external  device  and  the 
RW-400  system. 

The  display  buffer 

A  Display  Buffer  (DB)  acts  as  a  recirculating  storage  for  the 
cathode  ray  tube  display  units  in  a  Display  Console.  Information 
to  be  displayed  is  sent  to  the  DB  band  associated  with  a  particular 
display  tube  via  the  Central  Exchange.  The  Display  Buffer  sends 
only  status  information  back  to  other  system  modules  upon  request. 
The  information  displayed  on  any  tube  is  controlled  by  the  bit 
pattern  sent  to  the  Display  Buffer.  The  display  pattern  is  regener- 
ated 30  times  per  second  to  minimize  image  fading  and  flicker. 
The  preceding  explanation  of  the  Display  Buffer  has  little  meaning 
to  a  reader  unfamiliar  with  the  features  of  the  Display  Console 
itself.  This  console  is  therefore  described  in  more  detail  in  the 
following  paragraphs. 

Display  consoles 

Display  Consoles  can  give  a  problem  "analyst"  or  "monitor"  a 
visual picture of the status or results of any information being
handled by the RW-400 system. In addition to the actual Cathode
Ray  Tube,  numerical  indicator,  signal  lamp  and  typewriter  infor- 
mation outputs,  several  types  of  keyboard  activated  system  control 
and  parameter  entry  facilities  are  provided  on  the  console.  The 
total  man-machine  communication  facility  represented  by  each 
console  is  designed  to  be  primarily  a  function  of  the  computer 
control  programs  initiated  by  the  analyst  via  his  console. 

A  set  of  Display  Control  Keys  generate  messages  which  are 
recorded  on  a  Peripheral  Buffer  sector  for  later  interpretation  and 
display generation by a computer program. A set of Process Step
Keys is provided to the analyst so that he can initiate
preprogrammed system processing variations. Associated with the Process
Step Keys is an overlay or "program card" which permits the
assignment  of  a  variety  of  meanings  to  the  set  of  Process  Step  Keys. 
Insertion  of  the  overlay  by  the  analyst  gives  him  a  unique  label 
for  each  Process  Step  Key  and  automatically  cues  the  controlling 
computer  to  assign  the  corresponding  set  of  programs  to  each  key 
message.  A  Data  Entry  Keyboard  is  provided  on  the  console  so 
that  the  analyst  can  enter  control  parameters  when  asked  to  do 
so  via  the  display  devices. 

A  Joystick  Lever  affords  the  console  operator  a  means  of  con- 
trolling the  position  of  cross  hair  markers  on  the  cathode  ray 
display  tubes.  Associated  with  the  joystick  are  control  keys  which 
may  be  used  to  send  a  message  to  the  controlling  computer  speci- 
fying the  coordinates  of  the  cross  hairs.  Control  programs  may  be 
written,  for  example,  to  act  upon  this  information  to  reorient  the 
display  with  respect  to  the  area  selected  by  the  cross  hair  position. 

A  Light  Gun  is  also  provided  as  a  means  of  selecting  any  point 
on the cathode ray tube displays. The gun emits a small beam
of  light.  With  the  beam  centered  on  a  given  point  on  the  cathode 
ray  display  tube,  pressing  the  trigger  results  in  the  automatic 
generation  of  a  message  to  the  Peripheral  Buffer  specifying  the 
address  in  the  Display  Buffer  containing  the  coordinates  of  the 
selected  point. 

A  set  of  Status  and  Error  lights  are  contained  on  the  Display 
Console to provide the console operator with over-all knowledge
of  the  system  and  thus  minimize  conflicting  control  requests  and 
intermodule  interference.  For  example,  a  Peripheral  Buffer  may 
not  be  ready  to  accept  a  console  key  message  until  after  certain 
previously  requested  control  actions  have  been  completed.  The 
Status  Lights  indicate  this  condition  to  the  console  operator  so 
that  he  may  act  accordingly. 

The  printer  module 

The Printer Module (PR) is basically a 160 column, 900 line per
minute Anelex type printer. It receives information from either
a Computer or a Buffer module via the Central Exchange.
Individual characters to be printed are represented by a 6-bit code
and are transmitted four to a computer word. Zero suppression,
line completion and information block end codes are included for
format control. A plugboard is provided for flexibility in columnar
data arrangement. Paper feed is controlled by means of a loop
of 7-channel punched paper tape. Control of the printing operation
has been arranged so that the connected control module may send
line headings from one set of memory locations, stop sending
information while going to a different part of the memory, and
then proceed to send data from this new set of memory locations
to complete a line of print.
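
Four 6-bit character codes occupy 24 of the 26 information bits of a word. The Python sketch below shows one way such packing could work. The bit positions are an assumption (first character in the highest of the low 24 bits, two high bits unused); the text does not specify the layout, and the function names are hypothetical.

```python
def pack_chars(codes):
    """Pack up to four 6-bit codes into one integer word (first code highest)."""
    if len(codes) > 4 or any(not 0 <= c < 64 for c in codes):
        raise ValueError("need at most four codes in the range 0..63")
    word = 0
    for c in list(codes) + [0] * (4 - len(codes)):   # pad short groups with zero
        word = (word << 6) | c
    return word


def unpack_chars(word):
    """Recover the four 6-bit codes from a packed word."""
    return [(word >> shift) & 0o77 for shift in (18, 12, 6, 0)]


packed = pack_chars([1, 2, 3, 4])
codes = unpack_chars(packed)
```

At four characters per word, a full 160-column line corresponds to 40 words sent from the controlling module.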

The  punched  card  modules 

The RW-400 Data System may be equipped with a high speed
punched card reading module (CR) and an IBM card punch. The
CR communicates with Computer or Buffer modules via the
Central Exchange. It is capable of reading 80 column punched
cards at the rate of 2,500 cards per minute. The card punch is
connected to the system through the Peripheral Buffer Module
(PB) since it is a relatively low speed device. Emphasis has not
been placed on directly connected punched card equipment since
the sources of large volumes of punched cards usually convert this
data into magnetic tape form which may be more rapidly handled
using the Tape Adapter Module (TA).





APPENDIX 1    RW-40 ISP DESCRIPTION

The description does not include Input-Output instructions, interrupts and communication with the other computers or processors.
The description was taken from the Preliminary Manual of Information on the RW-40 and is no doubt changed in final machines.

Pc State
    A<26:1>                        Arithmetic register
    B<26:1>                        extension to A
    AB<52:1> := A□B                Arithmetic register (double)
    P<10:1>                        Program Counter
    Ov                             Overflow for arithmetic shifts, +, -, and /
    SR<20:1>                       Sense Register
    Parity_error                   for Mp and transfer to other computers
    Program_error                  undefined command or incorrect sequence of IO commands
    Run

Mp State
    M[1:1022]<26:1>                Mp registers 0 and 1023 are inaccessible

Pc Console State
    CJS<8:1>                       conditional jump switches
    Control_panel_test             communication indicator

External State for IO and Other Computers
    Tape_read                      tape search flag
    External_Address/EA<10:1>      register associated with Pc to address another module
    M[0:1023]<26:1>                extra memory being accessed by External Address register
    I_cond<19:1>                   interrupt conditions to Pc
    IO_select<3:1>                 1 of 8 IO devices can be selected
    IO_Data<13:1>                  IO device Data

Instruction Format
    instruction/i<26:1>
    f/op<6:1> := i<26:21>          function or op code bits
    g<10:1>   := i<20:11>          first address
    j<5:1>    := g<5:1>            test selection parameter
    h<10:1>   := i<10:1>           second address
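
The instruction format can be checked with a small Python decoder. This is a sketch that assumes bit 1 is the least significant bit of the 26-bit word, as the descending ranges in the notation suggest; the function names `decode` and `encode` are hypothetical.

```python
def decode(word):
    """Split a 26-bit instruction word into (op, g, h, j)."""
    if not 0 <= word < 1 << 26:
        raise ValueError("instruction words are 26 bits")
    op = (word >> 20) & 0o77       # i<26:21>, function or op code
    g = (word >> 10) & 0o1777      # i<20:11>, first address
    h = word & 0o1777              # i<10:1>, second address
    j = g & 0o37                   # g<5:1>, test selection parameter
    return op, g, h, j


def encode(op, g, h):
    """Assemble a word from its three fields (inverse of decode)."""
    return (op << 20) | (g << 10) | h


fields = decode(encode(0o27, 0o1777, 0o12))
```

Here op 27 (octal) is Transmit and a first address of 1777 (octal), that is 1023, is the all-ones value that selects the external memory.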

Operand Calculation Process
    G<26:1> := (G'; next                                       first operand
        (g = 1777₈) → External_Address ← External_Address + 1)
    G'<26:1> := ((g = 0) → 0;
        (0 < g < 1777) → M[g]<26:1>;
        (g = 1777) → M[External_Address]<26:1>)
    H<26:1> := (H'; next                                       second operand
        (h = 1777) → External_Address ← External_Address + 1)
    H'<26:1> := ((h = 0) → 0;
        (0 < h < 1777) → M[h]<26:1>;
        (h = 1777) → M[External_Address]<26:1>)

Instruction Interpretation Process

    Run → (instruction ← M[P]; P ← P + 1; next       fetch
        Instruction_execution)                       execute

Instruction Set and Instruction Execution Process
Instruction_execution := (

    Transmit                   (:= op = 27) → (H' ← G);

  Arithmetic (1's complement)
    Replace Add                (:= op = 0)  → (Ov,A ← H + G; next H' ← A);
    Hold Add                   (:= op = 1)  → (Ov,A ← H + G);
    Store Add                  (:= op = 2)  → (Ov,A ← A + G; next H' ← A);
    Replace Subtract           (:= op = 3)  → (Ov,A ← H - G; next H' ← A);
    Hold Subtract              (:= op = 4)  → (Ov,A ← H - G);
    Store Subtract             (:= op = 5)  → (Ov,A ← A - G; next H' ← A);
    Replace Absolute Subtract  (:= op = 6)  → (A ← abs(H) - abs(G); next H' ← A);
    Hold Absolute Subtract     (:= op = 7)  → (A ← abs(H) - abs(G));
    Store Absolute Subtract    (:= op = 10) → (A ← abs(A) - abs(G); next H' ← A);
    Replace Multiply           (:= op = 11) → (AB ← H × G; next H' ← A);
    Hold Multiply              (:= op = 12) → (AB ← H × G);
    Store Multiply             (:= op = 13) → (AB ← A × G; next H' ← A);
    Replace Divide             (:= op = 14) → ((H ≥ G) → Ov ← 1;
                                               (H < G) → (A,B ← H/G; next H' ← A));
    Hold Divide                (:= op = 15) → ((H ≥ G) → Ov ← 1;
                                               (H < G) → (A,B ← H/G));
    Store Divide               (:= op = 16) → ((A ≥ G) → Ov ← 1;
                                               (A < G) → (A,B ← A/G; next H' ← A));
    Replace Square Root        (:= op = 17) → (A ← sqrt(H+G); next H' ← A);
    Hold Square Root           (:= op = 20) → (A ← sqrt(H+G));
    Store Square Root          (:= op = 21) → (A ← sqrt(A+G); next H' ← A);
    Accumulate Add             (:= op = 25) → (A ← Ov□A + H + G);
    Accumulate Multiply        (:= op = 26) → (A ← Ov□A + H × G);

Shift, g<10:1> is used to control the shift as follows:

g<5:1> specifies number of shifts

g<6> = 1 → shift left; g<6> = 0 → shift right

g<7> = 1 → indicate an overflow

g<8> = 1 → round the result

g<9> = 1 → signed arithmetic; g<9> = 0 (logical)

g<10> = 0 → A is the operand; g<10> = 1 → AB is the operand

Shift  :=  (op  =  30)  ^  (OvnA  <- f  (A  x  25'^'5''',  B  ,  2'^^''^,  a<(- :  \  (!>) :  next  H  v-A): 
Logical  or  boolean  vector  data: 


Replace  Insert 

( : =  op  = 

22) 

-  (A 

.-  (H  A  - 

G)  V 

(A  ^ 

G)  ;  next  H'  ,-  A) 

Hold  Insert 

(:=  op  = 

23) 

-  (A 

-  (H  A 

G)  V 

(A  A 

G)); 

Store  Insert 

(:=  op  = 

2h) 

-  (A 

-  A  A  G  ; 

next 

H'  .- 

A): 

Test  Jump 

f :=  op  = 

31) 

-  ( 

488 


Part  5  I  The  PMS  level 


Section  4  |  Network  computers  and  computer  networks 


(g<IO:7>  )!  0)  -  ( 

A  ^  ((g<10>  ->  l^jcond;  -,  g<10>  -  177777777)  A 
(g'.9>  -  SR;  -ng  g  >  -  177777777)  A 
Cg<8>  -  lOCiSelectalO Jata;      9<8>  -  177777777)  A 
(g<7>-CJS;  -.g-7>  ^  177777777));  next 
(g<6>  STest)  -  (P  ^  h)) ; 
The  Test  condition  is  a  selected  bit  of       or  other  Pc  or  TO  bits. 


Test 


=0)   -  0; 

£  j   £  32)   -  A<j>; 

=  33)   -  (Ov;  Ov  -  0) ; 

=  3'))   -  (Parity  error;  Parity  error  -0); 

=  35)  ^  (Control^panel  test;  Control^panel  u.test      0)  ; 

=  36)  -•  (Tape^read;  Tape^read  ^  0)  ; 

=  37)  -  (Program  ^rror ;  Program^error  •-0)) 


Link  Jump 

(:=  op  = 

32) 

-  ((g      0)   ,  (P  ^  h;  G  JO:  1 

-  P); 

(g  =  0)   -  (P  -  h))  ; 

Ta 1 1 y  Jump 

(:=  op  = 

33) 

-  ((G  =  ^0)   -  (P  -  h); 

(G  =  0)   -  ; 

(G  -  0)   -  (G'   -  G  -  1  ;  P 

-h); 

(G  <  0)  -  (G  ^  G  +  D)  ; 

Compare  Jump 

(;=  op  = 

37) 

-  (A       G)    -  P   -  h; 

Load  A 

(:=  op  = 

3't) 

-  (A  ^  ODgDh) ; 

Insert  S 

{:=  op  = 

35) 

-  (S  -  (A  A  (ODgOh))   V  (S  A 

(Oogah))) 

Store  AB 

(:=  op  = 

36) 

-.  (G  ^  B;   H  »-  A; 

(g  =  0)    A  (h  =  0)   -  (A  -  E 

;  B  -  A)) 

end Instruction_execution


Chapter  39 


Parallel  operation  in  the  Control  Data 
6600¹

James  E.  Thornton 
History 

In the summer of 1960, Control Data began a project which
culminated in October 1964 in the delivery of the first 6600
Computer. In 1960 it was apparent that brute force circuit
performance and parallel operation were the two main approaches to
any advanced computer.

This  paper  presents  some  of  the  considerations  having  to  do 
with  the  parallel  operations  in  the  6600.  A  most  important  and 
fortunate  event  coincided  with  the  beginning  of  the  6600  project. 
This  was  the  appearance  of  the  high-speed  silicon  transistor,  which 
survived  early  difficulties  to  become  the  basis  for  a  nice  jump  in 
circuit  performance. 

System  organization 

The computing system envisioned in that project, and now called
the 6600, paid special attention to two kinds of use, the very large
scientific problem and the time sharing of smaller problems. For
the large problem, a high-speed floating point central processor
with access to a large central memory was obvious. Not so obvious,
but important to the 6600 system idea, was the isolation of this
central arithmetic from any peripheral activity.

It was from this general line of reasoning that the idea of a
multiplicity of peripheral processors was formed (Fig. 1). Ten such
peripheral processors have access to the central memory on one
side and the peripheral channels on the other. The executive
control of the system is always in one of these peripheral
processors, with the others operating on assigned peripheral or control
tasks. All ten processors have access to twelve input-output
channels and may "change hands," monitor channel activity, and
perform other related jobs. These processors have access to central
memory, and may pursue independent transfers to and from this
memory.

Each of the ten peripheral processors contains its own memory
for program and buffer areas, thereby isolating and protecting the

¹AFIPS Proc. FJCC, pt. 2, vol. 26, pp. 33-40, 1964.


more critical system control operations in the separate processors.
The central processor operates from the central memory with
relocating register and file protection for each program in central
memory.

Peripheral  and  control  processors 

The peripheral and control processors are housed in one chassis
of the main frame. Each processor contains 4096 memory words
of 12 bits length. There are 12- and 24-bit instruction formats to
provide for direct, indirect, and relative addressing. Instructions
provide logical, addition, subtraction, shift, and conditional
branching. Instructions also provide single word or block transfers
to and from any of twelve peripheral channels, and single word
or block transfers to and from central memory. Central memory
words of 60 bits length are assembled from five consecutive
peripheral words. Each processor has instructions to interrupt the
central processor and to monitor the central program address.

To get this much processing power with reasonable economy
and space, a time-sharing design was adopted (Fig. 2). This design
contains a register "barrel" around which is moving the dynamic
information for all ten processors. Such things as program address,
accumulator contents, and other pieces of information totalling
52 bits are shifted around the barrel. Each complete trip around
requires one major cycle or one thousand nanoseconds. A "slot"
in the barrel contains adders, assembly networks, distribution
network, and interconnections to perform one step of any
peripheral instruction. The time to perform this step or, in other words,
the time through the slot, is one minor cycle or one hundred
nanoseconds. Each of the ten processors, therefore, is allowed one
minor cycle of every ten to perform one of its steps. A peripheral
instruction may require one or more of these steps, depending on
the kind of instruction.

In effect, the single arithmetic and the single distribution and
assembly network are made to appear as ten. Only the memories
are kept truly independent. Incidentally, the memory read-write
cycle time is equal to one complete trip around the barrel, or one
thousand nanoseconds.
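The time-sharing of the slot can be pictured as a simple round-robin schedule. The following Python sketch is only an illustration of the timing stated above; the function names and the assumption that processor n occupies the slot in every minor cycle whose index is n modulo ten are ours, not a description of the actual hardware sequencing.

```python
MINOR_CYCLE_NS = 100   # time through the slot, as stated in the text
NUM_PROCESSORS = 10    # peripheral and control processors sharing the barrel


def barrel_schedule(total_minor_cycles):
    """Return (time_ns, processor) pairs naming the slot occupant per minor cycle.

    Assumption: processors take the slot in fixed rotating order 0, 1, ..., 9.
    """
    return [(t * MINOR_CYCLE_NS, t % NUM_PROCESSORS)
            for t in range(total_minor_cycles)]


def time_between_steps_ns():
    """Interval between two successive steps of one processor (one barrel trip)."""
    return NUM_PROCESSORS * MINOR_CYCLE_NS


schedule = barrel_schedule(20)
for time_ns, processor in schedule[:12]:
    print(f"t = {time_ns:4d} ns: processor {processor} is in the slot")
```

Under this model each processor executes one step per major cycle, which matches the memory read-write cycle of one thousand nanoseconds.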




[Figure: ten peripheral and control processors, each with its own 4096-word core memory, grouped around the 6600 central memory and the 6600 central processor.]

Fig. 1. Control Data 6600.


[Figure: processor registers (the barrel), time-shared instruction control, the ten processor memories, read and write pyramids with 12-, 24-, 36-, and 60-bit stages to and from central memory (60), I/O channels to external equipment, and the real time clock.]

Fig. 2. 6600 peripheral and control processors.




Input-output channels are bi-directional, 12-bit paths. One
12-bit word may move in one direction every major cycle, or 1000
nanoseconds, on each channel. Therefore, a maximum burst rate
of 120 million bits per second is possible using all ten peripheral
processors. A sustained rate of about 50 million bits per second
can be maintained in a practical operating system. Each channel
may service several peripheral devices and may interface to other
systems, such as satellite computers.
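The burst figure follows directly from the word length and the major cycle. The short Python check below reproduces the arithmetic; it assumes (our reading of the text) that each of the ten processors moves one 12-bit word per major cycle, and the names are illustrative only.

```python
BITS_PER_WORD = 12        # peripheral word length
MAJOR_CYCLE_NS = 1000     # one word per channel per major cycle
NUM_PROCESSORS = 10       # peripheral and control processors


def burst_rate_bits_per_second(active_processors=NUM_PROCESSORS):
    """Peak transfer rate if every active processor moves one word per major cycle."""
    words_per_second = 1_000_000_000 // MAJOR_CYCLE_NS
    return active_processors * BITS_PER_WORD * words_per_second


print(burst_rate_bits_per_second())    # all ten processors
print(burst_rate_bits_per_second(1))   # a single processor
```

The sustained rate of about 50 million bits per second quoted in the text is an operating figure and cannot be derived from this arithmetic.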

Peripheral  and  control  processors  access  central  memory 
through  an  assembly  network  and  a  dis-assembly  network.  Since 
five  peripheral  memory  references  are  required  to  make  up  one 
central  memory  word,  a  natural  assembly  network  of  five  levels 
is  used.  This  allows  five  references  to  be  "nested"  in  each  network 
during  any  major  cycle.  The  central  memory  is  organized  in 
independent  banks  with  the  ability  to  transfer  central  words  every 
minor  cycle.  The  peripheral  processors,  therefore,  introduce  at 
most about 2% interference at the central memory address control.
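The assembly and dis-assembly of a central memory word can be sketched as bit packing. In the Python fragment below the function names are hypothetical, and the ordering (first peripheral word in the most significant 12 bits) is an assumption made for illustration; the text does not state the order.

```python
WORD_BITS = 12            # peripheral word length
WORDS_PER_CENTRAL = 5     # five peripheral words make one 60-bit central word


def assemble(words):
    """Pack five 12-bit words into one 60-bit integer (first word assumed highest)."""
    if len(words) != WORDS_PER_CENTRAL:
        raise ValueError("exactly five peripheral words are required")
    value = 0
    for word in words:
        if not 0 <= word < (1 << WORD_BITS):
            raise ValueError("each peripheral word must fit in 12 bits")
        value = (value << WORD_BITS) | word
    return value


def disassemble(value):
    """Split a 60-bit integer back into five 12-bit words, highest first."""
    return [(value >> shift) & 0o7777 for shift in (48, 36, 24, 12, 0)]


sample = [0o1234, 0o5670, 0o0001, 0o7777, 0o0000]
central_word = assemble(sample)
print(oct(central_word), disassemble(central_word) == sample)
```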


A  single  real  time  clock,  continuously  running,  is  available  to 
all  peripheral  processors. 


Central  processor 

The 6600 central processor may be considered the high-speed
arithmetic unit of the system (Fig. 3). Its program, operands, and
results are held in the central memory. It has no connection to
the peripheral processors except through memory and except for
two single controls. These are the exchange jump, which starts
or interrupts the central processor from a peripheral processor,
and the central program address which can be monitored by a
peripheral processor.

A key description of the 6600 central processor, as you will
see in later discussion, is "parallel by function." This means that
a number of arithmetic functions may be performed concurrently.
To this end, there are ten functional units within the central

[Figure: peripheral and control processors 1 through 10 with 12 input/output channels; central processor with 24 operating registers and functional units labelled multiply, multiply, divide, long add, shift, boolean, increment, increment, and branch.]

Fig. 3. Block diagram of 6600.




processor. These are the two increment units, floating add unit,
fixed add unit, shift unit, two multiply units, divide unit, boolean
unit, and branch unit. In a general way, each of these units is a
three address unit. As an example, the floating add unit obtains
two 60-bit operands from the central registers and produces a
60-bit result which is returned to a register. Information to and
from these units is held in the central registers, of which there
are twenty-four. Eight of these are considered index registers and
are 18 bits long; one of them always contains zero. Eight
are considered address registers, are 18 bits long, and serve
to address the five read central memory trunks and the two store
central memory trunks. Eight are considered floating point
registers, are 60 bits long, and are the only central registers to
access central memory during a central program.

In a sense, just as the whole central processor is hidden behind
central memory from the peripheral processors, so, too, the ten
functional units are hidden behind the central registers from
central memory. As a consequence, a considerable instruction
efficiency is obtained and an interesting form of concurrency is
feasible and practical. The fact that a small number of bits can
give meaningful definition to any function makes it possible to
develop forms of operand and unit reservations needed for a
general scheme of concurrent arithmetic.

Instructions  are  organized  in  two  formats,  a  15-bit  format  and 
a  30-bit  format,  and  may  be  mixed  in  an  instruction  word  (Fig. 
4).  As  an  example,  a  15-bit  instruction  may  call  for  an  ADD, 


   f     m     i     j     k
   3     3     3     3     3      (15 bits)

f, m   operation code
i      result register (1 of 8)
j      1st operand register (1 of 8)
k      2nd operand register (1 of 8)

Fig. 4. Fifteen-bit instruction format.


designated by the f and m octal digits, from registers designated
by the j and k octal digits, the result going to the register
designated by the i octal digit. In this example, the addresses of the
three-address, floating add unit are only three bits in length, each
address referring to one of the eight floating point registers. The
30-bit format follows this same form but substitutes for the k octal
digit an 18-bit constant K which serves as one of the input
operands. These two formats provide a highly efficient control of
concurrent operations.
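The two formats can be decoded with a few shifts and masks. The Python sketch below assumes the fields are packed left to right in the order f, m, i, j, k, with f in the most significant bits; the type and function names are ours, and the sample values are arbitrary bit patterns rather than particular 6600 operations.

```python
from typing import NamedTuple


class ShortInstruction(NamedTuple):
    f: int  # first octal digit of the operation code
    m: int  # second octal digit of the operation code
    i: int  # result register (1 of 8)
    j: int  # first operand register (1 of 8)
    k: int  # second operand register (1 of 8)


class LongInstruction(NamedTuple):
    f: int
    m: int
    i: int
    j: int
    K: int  # 18-bit constant replacing the k digit


def decode_short(word):
    """Split a 15-bit instruction into its five octal digits."""
    if not 0 <= word < (1 << 15):
        raise ValueError("a short instruction must fit in 15 bits")
    return ShortInstruction(*((word >> shift) & 0o7 for shift in (12, 9, 6, 3, 0)))


def decode_long(word):
    """Split a 30-bit instruction into f, m, i, j and the 18-bit constant K."""
    if not 0 <= word < (1 << 30):
        raise ValueError("a long instruction must fit in 30 bits")
    f, m, i, j = ((word >> shift) & 0o7 for shift in (27, 24, 21, 18))
    return LongInstruction(f, m, i, j, word & 0o777777)


print(decode_short(0o30123))
print(decode_long(0o5012000005))
```

Because every register address is a single octal digit, an instruction word written in octal can be read off field by field without any arithmetic.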

As a background, consider the essential difference between a
general purpose device and a special device in which high speeds
are required. The designer of the special device can generally
improve on the traditional general purpose device by introducing
some form of concurrency. For example, some activities of a
housekeeping nature may be performed separate from the main
sequence of operations in separate hardware. The total time to
complete a job is then optimized to the main sequence and excludes
the housekeeping. The two categories operate concurrently.

It would be, of course, most attractive to provide in a general
purpose device some generalized scheme to do the same kind of
thing. The organization of the 6600 central processor provides just
this kind of scheme. With a multiplicity of functional units, and
of operand registers and with a simple and highly efficient
addressing system, a generalized queue and reservation scheme is
practical. This is called the scoreboard.

The scoreboard maintains a running file of each central register,
of each functional unit, and of each of the three operand trunks
to and from each unit. Typically, the scoreboard file is made up
of two-, three-, and four-bit quantities identifying the nature of
register and unit usage. As each new instruction is brought up,
the conditions at the instant of issuance are set into the scoreboard.
A snapshot is taken, so to speak, of the pertinent conditions. If
no waiting is required, the execution of the instruction is begun
immediately under control of the unit itself. If waiting is required
(for example, an input operand may not yet be available in the
central registers), the scoreboard controls the delay, and when
released, allows the unit to begin its execution. Most important,
this activity is accomplished in the scoreboard and the functional
unit, and does not necessarily limit later instructions from being
brought up and issued.

In this manner, it is possible to issue a series of instructions,
some related, some not, until no functional units are left free or
until a specific register is to be assigned more than one result. With
just those two restrictions on issuing (unit free and no double
result), several independent chains of instructions may proceed
concurrently. Instructions may issue every minor cycle in the




absence of the two restraints. The instruction executions, in
comparison, range from three minor cycles for fixed add, through 10 minor
cycles for floating multiply, to 29 minor cycles for floating divide.
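The two issue restrictions and the operand wait described above can be mimicked in a few lines. The Python model below is a deliberately simplified sketch, not the 6600 scoreboard logic: it tracks only when each unit becomes free and when each register's pending result arrives. The unit names, the register names, and the sample program are hypothetical; the latencies are the minor-cycle counts quoted in the text.

```python
def simulate_issue(instructions):
    """Return (issue, start, done) minor-cycle times for each instruction.

    Each instruction is (unit, result_register, operand_registers, latency).
    Issue waits for a free unit and for no pending result on the result
    register; execution then waits for the operands to become available.
    """
    unit_free_at = {}   # unit name -> cycle at which the unit is free again
    reg_ready_at = {}   # register -> cycle at which its pending result arrives
    clock = 0           # earliest cycle at which the next instruction may issue
    timeline = []
    for unit, result, operands, latency in instructions:
        issue = max(clock, unit_free_at.get(unit, 0), reg_ready_at.get(result, 0))
        start = max([issue] + [reg_ready_at.get(r, 0) for r in operands])
        done = start + latency
        unit_free_at[unit] = done
        reg_ready_at[result] = done
        timeline.append((issue, start, done))
        clock = issue + 1   # at most one issue per minor cycle
    return timeline


program = [
    ("multiply1", "X0", ("X1", "X2"), 10),   # floating multiply
    ("multiply2", "X3", ("X4", "X5"), 10),   # independent: issues next cycle
    ("fixed_add", "X6", ("X0", "X3"), 3),    # issues at once, waits for operands
    ("multiply1", "X7", ("X6", "X1"), 10),   # unit busy: issue is held up
]
timeline = simulate_issue(program)
for entry in timeline:
    print(entry)
```

In this run the third instruction issues immediately but starts only when both products are available, while the fourth cannot issue until the first multiply unit is free; the real scoreboard records further conditions that this model omits.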

To provide a relatively continuous source of instructions, one
buffer register of 60 bits is located at the bottom of an instruction
stack capable of holding 32 instructions (Fig. 5). Instruction words
from memory enter the bottom register of the stack pushing up
the old instruction words. In straight line programs, only the
bottom two registers are in use, the bottom being refilled as quickly
as memory conflicts allow. In programs which branch back to an
instruction in the upper stack registers, no refills are allowed after
the branch, thereby holding the program loop completely in the
stack. As a result, memory access or memory conflicts are no longer
involved, and a considerable speed increase can be had.

Five memory trunks are provided from memory into the central
processor to five of the floating point registers (Fig. 6). One address
register is assigned to each trunk (and therefore to the floating
point register). Any instruction calling for address register result
implicitly initiates a memory reference on that trunk. These
instructions are handled through the scoreboard and therefore tend
to overlap memory access with arithmetic. For example, a new
memory word to be loaded in a floating point register can be
brought in from memory but may not enter the register until all


previous uses of that register are completed. The central registers,
therefore, provide all of the data to the ten functional units, and
receive all of the unit results. No storage is maintained in any unit.

Central memory is organized in 32 banks of 4096 words.
Consecutive addresses call for a different bank; therefore, adjacent
addresses in one bank are in reality separated by 32. Addresses
may be issued every 100 nanoseconds. A typical central memory
information transfer rate is about 250 million bits per second.
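The bank arrangement amounts to taking the address modulo 32. The Python sketch below illustrates this; it assumes that the bank is selected by the low-order part of the address, which is our inference from the statement that consecutive addresses call for different banks, and the function names are illustrative.

```python
NUM_BANKS = 32
WORDS_PER_BANK = 4096


def bank_of(address):
    """Bank holding the given central memory address (low-order selection assumed)."""
    return address % NUM_BANKS


def offset_in_bank(address):
    """Position of the address within its bank."""
    return address // NUM_BANKS


total_words = NUM_BANKS * WORDS_PER_BANK
for address in (0, 1, 31, 32, 33):
    print(address, bank_of(address), offset_in_bank(address))
```

A run of consecutive addresses therefore cycles through all 32 banks before returning to the first, so a new reference can be started each minor cycle without waiting for a busy bank.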

As mentioned before, the functional units are hidden behind
the registers. Although the units might appear to increase
hardware duplication, a pleasant fact emerges from this design. Each
unit may be trimmed to perform its function without regard to
others. Speed increases are had from this simplified design.

As an example of special functional unit design, the floating
multiply accomplishes the coefficient multiplication in nine minor
cycles plus one minor cycle to put away the result for a total of
10 minor cycles, or 1000 nanoseconds. The multiply uses layers
of carry save adders grouped in two halves. Each half concurrently
forms a partial product, and the two partial products finally merge
while the long carries propagate. Although this is a fairly large
complex of circuits, the resulting device was sufficiently smaller
than originally planned to allow two multiply units to be included
in the final design.


[Figure: instruction stack of 8 60-bit words, filled from central memory through the buffer register and feeding the instruction registers.]

Fig. 5. 6600 instruction stack operation.




[Figure: central memory exchanging operands (60-bit), results, addresses (18-bit), and instructions with the operating registers; increment registers (18-bit); the 10 functional units; the instruction registers and the instruction stack (up to 8 words, 60-bit).]

Fig. 6. Central processor operating registers.


To sum up the characteristics of the central processor,
remember that the broadbrush description is "concurrent operation." In
other words, any program operating within the central processor
utilizes some of the available concurrency. The program need not
be written in a particular way, although certainly some
optimization can be done. The specific method of accomplishing this
concurrency involves issuing as many instructions as possible while
handling most of the conflicts during execution. Some of the
essential requirements for such a scheme include:

1  Many functional units

2  Units with three address properties

3  Many transient registers with many trunks to and from
the units

4  A simple and efficient instruction set


Construction 

Circuits  in  the  6600  computing  system  use  all-transistor  logic  (Fig. 
7).  The  silicon  transistor  operates  in  saturation  when  switched 
"on"  and  averages  about  five  nanoseconds  of  stage  delay.  Logic 
circuits  are  constructed  in  a  cordwood  plug-in  module  of  about 
inches  by  inches  by  0.8  inch.  An  average  of  about  50 
transistors  are  contained  in  these  modules. 

Memory  circuits  are  constructed  in  a  plug-in  module  of  about 
six inches by six inches by 2½ inches (Fig. 8). Each memory module
contains  a  coincident  current  memory  of  4096  12-bit  words.  All 
read-write  drive  circuits  and  bit  drive  circuits  plus  address  trans- 
lation are  contained  in  the  module.  One  such  module  is  used  for 
each  peripheral  processor,  and  five  modules  make  up  one  bank 
of  central  memory. 

Logic  modules  and  memory  modules  are  held  in  upright  hinged 
chassis  in  an  X  shaped  cabinet  (Fig.  9).  Interconnections  between 
modules  on  the  chassis  are  made  with  twisted  pair  transmission 




Fig.  7.  6600  printed  circuit  module. 


lines. Interconnections between chassis are made with coaxial
cables. 

Both maintenance and operation are accomplished at a
programmed display console (Fig. 10). More than one of these consoles
may be included in a system if desired. Dead start facilities bring


Fig.  8.  6600  memory  module. 


Fig.  9.  6600  main  frame  section. 


Fig.  10.  6600  display  console. 




the ten peripheral processors to a condition which allows
information to enter from any chosen peripheral device. Such loads
normally  bring  in  an  operating  system  which  provides  a  highly 
sophisticated  capability  for  multiple  users,  maintenance,  and  so 
on. 

The  6600  Computer  has  taken  advantage  of  certain  technology 
advances,  but  more  particularly,  logic  organization  advances 


which  now  appear  to  be  quite  successful.  Control  Data  is  exploring 
advances  in  technology  upward  within  the  same  compatible 
structure,  and  identical  technology  downward,  also  within  the 
same  compatible  structure. 

References 

AllaR64;  ClayB64 




APPENDIX  1    CDC  6400,  6500,  6600 
CENTRAL  PROCESSOR  ISP  DESCRIPTION 



Pc State

P<17:0>                  Program counter

X[0:7]<59:0>             Main arithmetic registers. X[1:5] are implicitly loaded from
A[0:7]<17:0>             Mp when A[1:5] are loaded. X[6:7] are implicitly stored in
                         Mp when A[6:7] are loaded.

B[0]<17:0> := 0          B registers are general arithmetic registers, and can be used
B[1:7]<17:0>             as index registers.

Run                      1 if interpreting instructions; not under program control.

EM<17:0>                 Exit mode bits

Address_out_of_range_mode := EM<12>

Operand_out_of_range_mode := EM<13>

Indefinite_operand_mode   := EM<14>

The above description is incomplete in that the above 3 modes allow alarm conditions to trap Pc at Mp[RA]. Trapping occurs if
an alarm condition occurs "and" the mode is a one.

Mp State

Mp[0:777777₈]<59:0>      main core memory of 2^18 w, (256 kw)

Ms[0:2015232]<59:0>      ECS/Extended Core Storage. Program can only transfer data between
                         Mp and Ms. Program cannot be executed in Ms.

RA<17:0>                 reference (or relocation) address register to map a logical Mp'
                         into physical Mp

FL<17:0>                 field length - the bounds register which limits a program's
                         access to a range of Mp'

RAECS<59:36>             reference or relocation register for Ms (Extended Core Storage)

FLECS<59:36>             field length for ECS

Address_out_of_range     a bit denoting a state when memory mapping is invalid

Memory Mapping Process
This process maps or relocates a logical program, at location Mp' and Ms', into physical Mp and Ms.

Mp'[X] := ((X < FL) → Mp[X + RA];                                  logical Mp'
           (X ≥ FL) → (Run ← 0; Address_out_of_range ← 1))

Ms'[X] := ((X < FLECS) → Ms[X + RAECS];                            logical Ms'
           (X ≥ FLECS) → (Run ← 0; Address_out_of_range ← 1))
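The mapping of Mp' onto Mp is an ordinary base-and-bounds relocation. The Python sketch below restates it under stated assumptions: the class and function names are ours, the sample register values are arbitrary, and an out-of-range reference is modelled simply by clearing Run and setting the Address_out_of_range bit, with no attempt to model the exit-mode trapping.

```python
from dataclasses import dataclass


@dataclass
class CentralState:
    ra: int                             # RA, reference (relocation) address
    fl: int                             # FL, field length
    run: bool = True                    # Run
    address_out_of_range: bool = False  # Address_out_of_range


def map_logical(state, x):
    """Return the physical Mp address for logical address x, or None if x >= FL."""
    if 0 <= x < state.fl:
        return x + state.ra
    state.run = False
    state.address_out_of_range = True
    return None


state = CentralState(ra=0o1000, fl=0o400)
print(oct(map_logical(state, 0o377)))   # last word inside the field
print(map_logical(state, 0o400), state.run, state.address_out_of_range)
```

The same pattern applies to Ms' with FLECS and RAECS in place of FL and RA.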

Exchange jump storage allocation map at location n within Mp:

The following Mp" array is reserved when Pc state is stored, and switched to another job. The exchange jump instruction in
a Peripheral and Control Processor enacts the operation: (Mp" ← Pc; Pc ← Mp").

Mp"[n]<53:0>         := P□A[0]□000000₈

Mp"[n+1]<53:0>       := RA□A[1]□B[1]

Mp"[n+2]<53:0>       := FL□A[2]□B[2]

Mp"[n+3]<53:0>       := EM□A[3]□B[3]

Mp"[n+4]             := RAECS□A[4]□B[4]

Mp"[n+5]             := FLECS□A[5]□B[5]

Mp"[n+6]<35:0>       := A[6]□B[6]

Mp"[n+7]<35:0>       := A[7]□B[7]

Mp"[n+10₈:n+17₈]     := X[0:7]



Instruction Format

instruction<29:0>                          although 30 bits, most instructions are 15 bits; see
                                           Instruction Interpretation Process

fm<5:0>   := instruction<29:24>            operation code or function

fmi<8:0>  := fm□i                          extended op code

i<2:0>    := instruction<23:21>            specifies a register or an extension to op code

j<2:0>    := instruction<20:18>            specifies a register

k<2:0>    := instruction<17:15>            specifies a register

jk<5:0>   := j□k                           a shift constant (6 bits)

K<17:0>   := instruction<17:0>             an 18 bit address size constant

long_instruction  := ((fm < 10₈) ∨         30 bit instruction
                      (50 ≤ fm < 53) ∨
                      (60 ≤ fm < 63) ∨
                      (70 ≤ fm < 73))

short_instruction := ¬ long_instruction    15 bit instruction
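The long_instruction test can be written out directly. In the Python sketch below all operation codes are taken to be octal; only the constant 10₈ is marked as octal in the description, so treating 50 through 73 as octal is our assumption (fm has six bits and cannot reach 70 decimal). The function names are illustrative.

```python
def is_long_instruction(fm):
    """True if the 6-bit operation code fm selects the 30-bit format."""
    if not 0 <= fm < 0o100:
        raise ValueError("fm must be a 6-bit operation code")
    return (fm < 0o10
            or 0o50 <= fm < 0o53
            or 0o60 <= fm < 0o63
            or 0o70 <= fm < 0o73)


def instruction_length_bits(fm):
    """Length of the instruction selected by fm: 30 bits if long, otherwise 15."""
    return 30 if is_long_instruction(fm) else 15


for code in (0o02, 0o36, 0o50, 0o53, 0o72, 0o73):
    print(f"fm = {code:02o}: {instruction_length_bits(code)} bits")
```

The codes that select the long format are exactly those whose instructions carry the 18-bit constant K.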


Instruction Interpretation Process
A 15 bit (short) or 30 bit (long) instruction is fetched from Mp'[P]<(p × 15 + 15 - 1):(p × 15)> where p = 3, 2, 1, or 0. A 30
bit instruction cannot be stored across word boundaries (or in 2 Mp' locations).

p<1:0>                                     a pointer to 15 bit quarter word which has instruction

Run → (instruction<29:15> ← Mp'[P]<(p × 15 + 14):(p × 15)>;
       p ← p - 1; next                                             fetch

       (p = 0) ∧ long_instruction → Run ← 0;
       (p ≠ 0) ∧ long_instruction → (
           instruction<14:0> ← Mp'[P]<(p × 15 + 14):(p × 15)>;
           p ← p - 1); next

       Instruction_execution; next                                 execute

       (p = 0) → (p ← 3; P ← P + 1))

Instruction Set and Instruction Execution Process
(Operand fetches or stores between Mp' and X[i] occur by loading or storing registers A[i]. If (0 < i < 6) a fetch from
Mp'[A[i]] occurs. If (i ≥ 6) a store is made to Mp'[A[i]]. The description does not describe Address_out_of_range case,
which is treated like a null operation.)

Instruction_execution := (

Set A[i]/SAi

"SAi Aj + K"   (fm = 50) → (A[i] ← A[j] + K; next Fetch_Store);
"SAi Bj + K"   (fm = 51) → (A[i] ← B[j] + K; next Fetch_Store);
"SAi Xj + K"   (fm = 52) → (A[i] ← X[j]<17:0> + K; next Fetch_Store);
"SAi Xj + Bk"  (fm = 53) → (A[i] ← X[j]<17:0> + B[k]; next Fetch_Store);
"SAi Aj + Bk"  (fm = 54) → (A[i] ← A[j] + B[k]; next Fetch_Store);
"SAi Aj - Bk"  (fm = 55) → (A[i] ← A[j] - B[k]; next Fetch_Store);
"SAi Bj + Bk"  (fm = 56) → (A[i] ← B[j] + B[k]; next Fetch_Store);
"SAi Bj - Bk"  (fm = 57) → (A[i] ← B[j] - B[k]; next Fetch_Store);

Fetch_Store := (                           process to get operand in X or store operand from X when A
                                           is written
    (0 < i < 6) → (X[i] ← Mp'[A[i]]);
    (i ≥ 6) → (Mp'[A[i]] ← X[i]))

Operations on B and X

Set B[i]/SBi

"SBi Aj + K"   (fm = 60) → (B[i] ← A[j] + K);

"SBi Bj + K"   (fm = 61) → (B[i] ← B[j] + K);
"SBi Xj + K"   (fm = 62) → (B[i] ← X[j]<17:0> + K);
"SBi Xj + Bk"  (fm = 63) → (B[i] ← X[j]<17:0> + B[k]);
"SBi Aj + Bk"  (fm = 64) → (B[i] ← A[j] + B[k]);
"SBi Aj - Bk"  (fm = 65) → (B[i] ← A[j] - B[k]);
"SBi Bj + Bk"  (fm = 66) → (B[i] ← B[j] + B[k]);
"SBi Bj - Bk"  (fm = 67) → (B[i] ← B[j] - B[k]);

Set X[i]/SXi

"SXi Aj + K"   (fm = 70) → (X[i] ← sign_extend(A[j] + K));
"SXi Bj + K"   (fm = 71) → (X[i] ← sign_extend(B[j] + K));
"SXi Xj + K"   (fm = 72) → (X[i] ← sign_extend(X[j] + K));
"SXi Xj + Bk"  (fm = 73) → (X[i] ← sign_extend(X[j] + B[k]));
"SXi Aj + Bk"  (fm = 74) → (X[i] ← sign_extend(A[j] + B[k]));
"SXi Aj - Bk"  (fm = 75) → (X[i] ← sign_extend(A[j] - B[k]));
"SXi Bj + Bk"  (fm = 76) → (X[i] ← sign_extend(B[j] + B[k]));
"SXi Bj - Bk"  (fm = 77) → (X[i] ← sign_extend(B[j] - B[k]));

Miscellaneous program control

"PS"  (:= fm = 0) → (Run ← 0);                       program stop
"NO"  (:= fm = 46) → ;                               no operation; pass

Jump unconditional

"JP Bi + K"  (:= fm = 02) → (P ← B[i] + K; p ← 3);

Jump on X[j] conditions

"ZR Xj K"  (:= fmi = 030) → ((X[j] = 0) → (P ← K; p ← 3));     zero
"NZ Xj K"  (:= fmi = 031) → ((X[j] ≠ 0) → (P ← K; p ← 3));     non zero
"PL Xj K"  (:= fmi = 032) → ((X[j] ≥ 0) → (P ← K; p ← 3));     plus or positive
"NG Xj K"  (:= fmi = 033) → ((X[j] < 0) → (P ← K; p ← 3));     negative

"IR Xj K"  (:= fmi = 034) → (                        out of range constant tests
    ¬((X[j]<59:48> = 3777) ∨ (X[j]<59:48> = 4000)) → (P ← K; p ← 3));
"OR Xj K"  (:= fmi = 035) → (
    ((X[j]<59:48> = 3777) ∨ (X[j]<59:48> = 4000)) → (P ← K; p ← 3));
"DF Xj K"  (:= fmi = 036) → (                        indefinite form constant tests
    ¬((X[j]<59:48> = 1777) ∨ (X[j]<59:48> = 6000)) → (P ← K; p ← 3));
"ID Xj K"  (:= fmi = 037) → (
    ((X[j]<59:48> = 1777) ∨ (X[j]<59:48> = 6000)) → (P ← K; p ← 3));

Jump on B[i], B[j] comparison

"EQ Bi Bj K"  (:= fm = 04) → ((B[i] = B[j]) → (P ← K; p ← 3));
"NE Bi Bj K"  (:= fm = 05) → ((B[i] ≠ B[j]) → (P ← K; p ← 3));
"GE Bi Bj K"  (:= fm = 06) → ((B[i] ≥ B[j]) → (P ← K; p ← 3));
"LT Bi Bj K"  (:= fm = 07) → ((B[i] < B[j]) → (P ← K; p ← 3));
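The out-of-range and indefinite tests in the X[j] jumps above examine only bits 59:48 of the operand. A minimal Python sketch of those four conditions follows. It assumes a 60-bit word held in a non-negative integer and that IR and DF jump when the operand is in range or definite, respectively; the function names are hypothetical.

```python
def exponent_field(x: int) -> int:
    """Return bits 59..48 of a 60-bit word (the sign and exponent field)."""
    return (x >> 48) & 0o7777

def is_out_of_range(x: int) -> bool:
    """True when the field holds one of the out-of-range constants."""
    return exponent_field(x) in (0o3777, 0o4000)

def is_indefinite(x: int) -> bool:
    """True when the field holds one of the indefinite constants."""
    return exponent_field(x) in (0o1777, 0o6000)

def jump_taken(mnemonic: str, x: int) -> bool:
    """Decide whether IR, OR, DF, or ID would set P to K for operand x."""
    conditions = {
        "IR": not is_out_of_range(x),  # in range
        "OR": is_out_of_range(x),      # out of range
        "DF": not is_indefinite(x),    # definite
        "ID": is_indefinite(x),        # indefinite
    }
    return conditions[mnemonic]
```

An operand whose upper twelve bits are 4000 (octal) makes jump_taken("OR", x) true and jump_taken("IR", x) false.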
Subroutine  call 

"RJ  K"  (:=  fmi  =  010)   -  ( 

H[K]<59:30  ^0'4gCD0ga(P  +  DcBOOOOOg;  next 
(P        +  I;  p  ^3)): 

Reading (REC) and writing (WEC) Mp with Extended Core Storage, subjected to bounds checks, and Ms'
"REC Bj + K"  (:= fmi = 011) → (                     read extended core


equal 
not  equal 

greater  than  or  equal 
less  than 

return  jump 


'■ip '  mapping 


500  Part  5  I  The  PMS  level 


Section  4     Network  computers  and  computer  networks 


Mp'[A[0]:A[0] + B[j] + K-1] ← Ms'[X[0]:X[0] + B[j] + K-1]);

"WEC Bj + K"  (:= fmi = 012) → (                     write extended core

Ms'[X[0]:X[0] + B[j] + K-1] ← Mp'[A[0]:A[0] + B[j] + K-1]);

Fixed Point Arithmetic and Logical operations using X

"IXi Xj + Xk"   (:= fm = 36) → (X[i] ← X[j] + X[k]);        integer sum
"IXi Xj - Xk"   (:= fm = 37) → (X[i] ← X[j] - X[k]);        integer difference
"CXi Xk"        (:= fm = 47) → (X[i] ← sum_modulo_2(X[k])); count the number of bits in X[k]
"BXi Xj"        (:= fm = 10) → (X[i] ← X[j]);               transmit
"BXi Xj * Xk"   (:= fm = 11) → (X[i] ← X[j] ∧ X[k]);        logical product
"BXi Xj + Xk"   (:= fm = 12) → (X[i] ← X[j] ∨ X[k]);        logical sum
"BXi Xj - Xk"   (:= fm = 13) → (X[i] ← X[j] ⊕ X[k]);
"BXi - Xk"      (:= fm = 14) → (X[i] ← ¬X[k]);              transmit complement
"BXi - Xk * Xj" (:= fm = 15) → (X[i] ← X[j] ∧ ¬X[k]);       logical product and complement
"BXi - Xk + Xj" (:= fm = 16) → (X[i] ← X[j] ∨ ¬X[k]);       logical sum and complement
"BXi - Xk - Xj" (:= fm = 17) → (X[i] ← X[j] ⊕ ¬X[k]);       logical difference and complement

"LXi jk"     (:= fm = 20) → (X[i] ← X[i] × 2^jk {rotate});
"AXi jk"     (:= fm = 21) → (X[i] ← X[i] / 2^jk);           arithmetic right shift
"LXi Bj Xk"  (:= fm = 22) → (                               left shift nominally
    ¬B[j]<17> → X[i] ← X[k] × 2^B[j]<5:0> {rotate};
    B[j]<17> → X[i] ← X[k] / 2^B[j]<10:0>);
"AXi Bj Xk"  (:= fm = 23) → (                               arithmetic right shift nominally
    ¬B[j]<17> → X[i] ← X[k] / 2^B[j]<10:0>;
    B[j]<17> → X[i] ← X[k] × 2^B[j]<5:0> {rotate});
"MXi jk"     (:= fm = 43) → (                               form mask
    X[i]<59:59-jk+1> ← 2^jk - 1;
    (jk = 0) → X[i] ← 0);

Floating Point Arithmetic using X
Only the least significant (ls) part of arithmetic is stored in
Floating DP operations.

"FXi Xj + Xk"  (:= fm = 30) → (X[i] ← X[j] + X[k] {sf});      floating sum
"FXi Xj - Xk"  (:= fm = 31) → (X[i] ← X[j] - X[k] {sf});      floating difference
"DXi Xj + Xk"  (:= fm = 32) → (X[i] ← X[j] + X[k] {ls.df});   floating dp sum
"DXi Xj - Xk"  (:= fm = 33) → (X[i] ← X[j] - X[k] {ls.df});   floating dp difference
"RXi Xj + Xk"  (:= fm = 34) → (
    X[i] ← round(X[j]) + round(X[k]) {sf});
"RXi Xj - Xk"  (:= fm = 35) → (                             round floating difference
    X[i] ← round(X[j]) - round(X[k]) {sf});
"FXi Xj * Xk"  (:= fm = 40) → (X[i] ← X[j] × X[k] {sf});      floating product
"RXi Xj * Xk"  (:= fm = 41) → (                             round floating product
    X[i] ← X[j] × X[k] {sf}; next X[i] ← round(X[i]) {sf});
"DXi Xj * Xk"  (:= fm = 42) → (X[i] ← X[j] × X[k] {ls.df});   floating dp product
"FXi Xj / Xk"  (:= fm = 44) → (X[i] ← X[j] / X[k] {sf});      floating divide
"RXi Xj / Xk"  (:= fm = 45) → (X[i] ← round(X[j] / X[k]) {sf});  round floating divide
"NXi Bj Xk"    (:= fm = 24) → (                             normalize
    X[i] ← normalize(X[k]) {sf};
    B[j] ← normalize_exponent(X[k]) {sf});



"ZXi Bj Xk"  (:= fm = 25) → (                               round and normalize
    X[i] ← round(X[k]) {sf}; next
    X[i] ← normalize(X[i]) {sf};
    B[j] ← normalize_exponent(X[i]) {sf});
"UXi Bj Xk"  (:= fm = 26) → (                               unpack
    B[j] ← X[k]<58:48> {si};
    X[i] ← X[k]<59,47:0> {sf});
"PXi Bj Xk"  (:= fm = 27) → (                               pack
    X[k]<58:48> ← B[j] {si};
    X[k]<59,47:0> ← X[i] {sf})
)

end Instruction_execution



APPENDIX  2    CDC  6400,  6500,  6600,  AND  6416 
PERIPHERAL  AND  CONTROL  PROCESSORS, 
PCP,  ISP  DESCRIPTION 



Pc State

A<17:0>                                  accumulator
P<11:0>                                  Program Address Counter

Mp State

M[0:4095]<11:0>                          Mp
M_index[0:63]<11:0> := M[0:63]<11:0>     special array in Mp reserved for index register

C('Central) State

CP_P<17:0>                               the main Pc instruction address counter
CPM[0:777777₈]<59:0>                     the Mp of main C

IO Registers for C('PCP)

C_DATA[0:63]<11:0>                       data buffers at peripheral K's
C_ACT[0:63]                              a bit to denote if 1 of the 64 K's is active
C_FLG[0:63]                              denotes a full (or empty) buffer at the K
C_FCN[0:63]<11:0>                        function or instruction register at a specific K

Instruction  Format 

ln5[0:l]<11:Q> 

instruction 

1 onq^I ns  t ruct  ion 

2  w  instruction:  defined  in  terms  of  op  codeSj  see  Tablet  P'^g^ 

shortu 

nstruction   :=  ^ 

1  ong^ji  ns  t  ruct  i  on 

i  w  instruction 

r;5;0> 

=   lns[0]<l 1 :6> 

functton  or  op  code 

*:5:0> 

=  lns[0]<5:0> 

iKH  :0> 

=  lns[l] 

address  part 

dnXl 7: 0> 

=  dOn 

i  <1  1  :  0> 

=   InsLl ]<1 1 :0> 

indirect  bit 

d^sign<l 1 

0>  :=  ( 

-id<5> 

^  Ood; 

d<5> 

d) 

md<l 1 :0> 

=  ( 

(d  =  0)  -1  m; 

(d      0)      m  +  M[d]) 

Effective Address Calculation Process

z := ((F<5:3> = 3) → d;
      (F<5:3> = 4) → i;
      (F<5:3> = 5) → md)

Instruction Interpretation Process

Run → (Ins[0] ← M[P]; P ← P + 1; next                       fetch
    long_instruction → (Ins[1] ← M[P]; P ← P + 1); next
    Instruction_execution)                                  execute
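The effective address calculation above selects one of three operands from bits 5:3 of the function code. The following Python sketch illustrates the selection. Several points are assumptions rather than statements of the ISP: the F<5:3> = 4 case is taken to be indirect addressing through M[d], the indexed sum m + M[d] is wrapped to 12 bits with a simple modulo rather than true one's complement arithmetic, and the name effective_address is hypothetical.

```python
def effective_address(f: int, d: int, m: int, memory: list) -> int:
    """Pick the effective address z from the 6-bit function code f.

    d is the 6-bit field of the first instruction word, m the 12-bit second
    word, and memory a list standing in for the 4096-word Mp.
    """
    mode = (f >> 3) & 0o7          # F<5:3>
    if mode == 3:                  # direct: the address is d itself
        return d
    if mode == 4:                  # assumed indirect: the address is held in M[d]
        return memory[d]
    if mode == 5:                  # md: m alone when d = 0, else indexed by M[d]
        if d == 0:
            return m
        return (m + memory[d]) & 0o7777   # 12-bit wrap is an assumption
    raise ValueError("function code does not use an effective address")
```

With M[5] = 100 (octal), function code 50 (octal), d = 5 and m = 200 (octal), the sketch returns 300 (octal).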



Implementation 

The 10 × 52 bits in the barrel for the 10 PCP ISP include:
A[0:9]<17:0>
P[0:9]<11:0>
Temporary Hardware registers (not in the ISP)


m 

K[0 
T[0 


9]<n  :0> 

9]<5:0> 

9]<2:0> 


accumulators

instruction address counters

low order 6 bits of an instruction or address data

six bits hold the operation code. The 3 bits specify the
trip count or state of an instruction's interpretation.


XoY 


8'0 
*S  00 


Instruction  execution  :=  ( 

 Oh   05 


PSN  -> 
nufl 


LJM  -  ( 

p.-  md); 


RJM  -  ( 
M[md]  --P; 
P-  md+1); 


UJN  -  (( 


ZJN  "  ( 
(A=0) 


NJN  -  ( 
P.-  P+d„sign)) 


PJN  -  ( 
-nA<17>  -( 


HJN  -  ( 
A<17> 


LMN → (A ← A ⊕ d);

LPN → (A ← A ∧ d);

SCN → (A ← A ∧ ¬d);

LDN → (A ← d);

LCN → (A ← ¬d);

ADN → (A ← A + d);

SBN → (A ← A - d);

LDC → (A ← dm);

ADC → (A ← A + dm);

LPC → (A ← A ∧ dm);

LMC → (A ← A ⊕ dm);

PSN → null

EXN → (CP_P ← A);

RPN → (A ← CP_P);


Lno  -  ( 


LDI   -  ( 


AD  I  -( 


SB  I  -( 


ST  I  -( 


AO  I    -  ( 


SOI  -( 


LDM → (A ← M[z]);

ADM → (A ← A + M[z]);

SBM → (A ← A - M[z]);

LMM → (A ← A ⊕ M[z]);

STM → (M[z] ← A);

RAM → (A ← A + M[z]; next M[z] ← A);

AOM → (A ← M[z] + 1; next M[z] ← A);

SOM → (A ← M[z] - 1; next M[z] ← A);


CRD  ( 
CPM[A]) ; 


CRM  ( 
M[m:m+ 
5xM[d]-l ]^ 
CPM[A:A+ 
H[d]-1 ] ) ; 


CWD  -  ( 
CPM[A]- 
M[d:d*5])  : 


CWH  ->  ( 
CPH[A:A+ 
M[d]-I  ]»- 
M[m:m+ 

5xM[d]-l  ]) ; 


AJM  ~  ( 
C^ACTtd]-  ( 


IJM  ( 
-n  C„ACT[d  ]-  ( 

-P-m)); 


FJM  ( 
C-FLG[d]-»  ( 


EJM  _  ( 
^  C„FLG[d  ]„  ( 


70 


IAN  -  ( 

A- 

C  JATA[d])  ; 


I  AM  -  { 
C^fLr,[d]-  ( 
M[m:m+A]^ 

CJ)ATA[d])); 


OAN  -  ( 
C^DATA[d] 
-A); 


0AM  -  ( 

CJ^LG[d]- 
C^OATA[d] 

-M[m:M-A]l)  ; 


ACN  -  ( 
C^ACT[d  ] 
-I )  : 


DCN  -.  ( 
C^ACT[d: 


FAN  ^  ( 
C^FCN[d  ] 
■A); 


FNC   -  ( 
C^FCN[d  ] 


Inetvuction ^xecut ion 


1  word  or  short ^instrmat ion 


Chapter  40 


Computer-network  examples 

We  are  just  entering  the  era  in  which  general-purpose  networks 
of  computers  make  technical  and  economic  sense.  The  requisite 
hardware  and  software  development  of  operating  systems  and 
multiprogramming  capability  is  still  maturing.  Thus,  unlike  the 
other  PMS  structures  discussed  in  this  book,  there  is  no  supply 
of  operational  systems  with  published  descriptions  upon  which  we 
can  draw.  Consequently,  we  have  assembled  several  brief  examples 
of  networks  to  provide  at  least  some  illustrations  of  what  is  sure 
to  be  an  important  aspect  of  computer  systems  in  the  near  future. 
The  more  interesting  of  these  examples  are  still  in  the  planning 
stages;  those  that  exist  currently  are  still  highly  specialized. 

Spatially distributed intercommunicating networks of digital
devices have existed for a long time. But many of the ones that
come most easily to mind are not computer networks. For example,
the various airline reservation systems like American Airlines'
SABRE [Plugge and Perry, 1961] have spatially distributed
terminals (T's) with a single Pc, possibly mediated by Pio's or Cio's.
When there are several Pc's, they are functionally integrated so
as to provide the total capacity and reliability needed. Some
military networks, such as the SAGE Air Defense System [Everett
et al., 1957], have multiple computers (SAGE actually has a very
large number). But they transmit to each other highly specialized
data streams (for example, aircraft positional information for
control). The National Physical Laboratory of England has made a very
comprehensive proposal for a general-purpose network [Davies et
al., 1967], although we do not include it as a chapter. Again, it
is just in the proposal stage. The Lawrence Radiation Laboratory
(at Livermore) network is no doubt the earliest and most impressive
network.

In terms of our PMS descriptions, a computer network (N)
requires at least two C's not connected through primary memory.
Thus each C has a Pc and an Mp of its own and has to communicate
with other C's through messages. Duplex computers are thus
defined as networks, provided they do not share Mp. For networks,
links (L's) are usually shown explicitly. In spatially distributed
systems, both the time delays and the flow rates of the links are
significant. The latter is so partly because the networks must make
use of the telephone communication system, which exists
independently of the networks, thus having parameters that do not
correspond with any of the internal parameters of the individual
computers. There may also be limitations of reliability, cost,
accessing characteristics, and the size of the information unit that
derive wholly from the links. For instance, many computer
networks would like to buy their transmissions from the telephone
system for very short intervals (milliseconds), at very high data
rates, and with short switching time (milliseconds), i.e., bursts.
Switching time and pricing policies within the telephone system
conspire to make this a difficult thing to do. Thus, with networks,
links become important independent components.

One  classification  of  networks  (N's)  is  by  fixed  or  variable 
interconnection  structure.  Fixed  structure  may  mean  that  the  links 
are  fixed  permanently  over  the  life  of  the  network.  However,  fixed 
structure  may  mean  only  that  connections  once  made  must  be 
held  for  long  periods  of  time  relative  to  the  message  flows.  An 
example  is  the  telephone  switching  system  mentioned  above, 
which  looks  like  a  variable  switching  structure  at  the  level  of 
human conversations, but like a fixed switching structure at the
level of computer conversations. Figures 1a and 1c show
variable-structure systems; Fig. 1b shows a fixed-structure system. In the
former,  any  C  can  talk  directly  to  any  other  C.  In  the  latter,  each 
C  talks  directly  to  only  a  few  C's;  thus,  to  communicate  with  the 
other  C's,  it  must  transmit  through  them  as  links;  that  is,  it  must 
use  another  C  as  an  L. 
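The idea of using other C's as links in a fixed-structure network can be illustrated with a short Python sketch. It assumes the fixed links are given as an adjacency mapping and uses a breadth-first search to find the shortest chain of relaying computers; the function name relay_path and the node labels are hypothetical and do not come from any of the networks described here.

```python
from collections import deque

def relay_path(links: dict, source: str, target: str):
    """Return the shortest list of computers from source to target, or None.

    links maps each computer to the computers it is directly wired to.
    Every computer strictly between the ends of the path acts as an L.
    """
    previous = {source: None}       # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in links.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None
```

For a chain in which C1 is wired only to C2, C2 to C3, and C3 to C4, relay_path returns C1, C2, C3, C4: the two middle computers serve as links.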

A second classification of N's is by the nature of the delays
suffered by the messages as they travel from an initiating C to
a target C. Communication can be direct, in which case the only
delays are those through the switches (S) and links (L) between
the two C's (Figs. 1a and 1b). Alternatively, communication can
involve storing messages at intermediate nodes (called store-and-
forward communication), thus introducing additional memory
delays into the communication but decreasing the demands for
coordination between the two C's. Although store-and-forward
systems can be built with the intermediate nodes being K's with
buffer memories, in the present context the natural form for such
a system uses the other C's in the system as the intermediate nodes,
as in Fig. 1c.
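The difference in delay between the two modes can be made concrete with a rough Python model. It is a sketch under simplifying assumptions that are not taken from the text: propagation time is ignored, a direct connection moves bits at the rate of its slowest link, and a store-and-forward node must receive the whole message before sending it on. The function and parameter names are hypothetical.

```python
def direct_delay(message_bits: float, link_rates_bps: list,
                 switch_delay_s: float = 0.0) -> float:
    """Delay when the whole path is held open end to end.

    One switch delay is charged per link; the message then flows at the
    rate of the slowest link on the path.
    """
    return switch_delay_s * len(link_rates_bps) + message_bits / min(link_rates_bps)

def store_and_forward_delay(message_bits: float, link_rates_bps: list,
                            node_delay_s: float = 0.0) -> float:
    """Delay when each intermediate node stores the full message first.

    The message is transmitted once per link, and each intermediate node
    adds its own memory delay.
    """
    transmission = sum(message_bits / rate for rate in link_rates_bps)
    return transmission + node_delay_s * (len(link_rates_bps) - 1)
```

Under these assumptions a 2,400-bit message over two 2,400-b/s links takes 1 second directly and 2 seconds with store-and-forward, before any memory delay at the intermediate node is added.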

Several  kinds  of  reasons  can  justify  the  existence  of  a  particular 
network.  The  following  list  is  adapted  from  Roberts  [1967]: 

Load sharing. A problem (program and data) initiated at one C
that is temporarily overloaded is sent to another for processing.
The cost of transshipment must clearly be less than the costs of
delay in getting the problem processed. Load sharing implies
highly similar facilities at the nodes of the network.

Fig. 1a. Variable-structure direct switching network PMS diagram.

Data sharing. A program is run at a node that has access to a large,
specialized data base, such as a specialized automated library. It is
less costly to bring the program to the data than to bring the data
to the program.

Program sharing. Data are sent to a C that has a specialized
program. This might happen because of the size of the program
(hence, fundamentally the same reason as data sharing), but it
might also happen because the knowledge (i.e., initialization and
error rituals) to run the program is available at one C but not
at another.

Specialized facilities. Within the network there need exist only
one of various rarely used facilities, such as large random-access
memories, or special display devices, or special-purpose array
processors.


Fig. 1b. Fixed-network PMS diagram.

Fig. 1c. Store-and-forward network PMS diagram (using C switching).

Message switching. There may be a communication task of such
magnitude that sophisticated switching and control are worthwhile.

Reliability. If some components fail, others can be used in their
place, thus permitting the total system to degrade gracefully. (At
the present state of the art, peripheral computers are needed to
isolate the periphery from the unreliability of the network, and
vice versa.)

Peak  computing  power.  Large  parts  of  the  total  system  can  be 
devoted  for  short  periods  to  a  single  task,  if  there  are  important 
real-time  constraints  to  be  met.  This  depends  on  being  able  to 
fractionate  the  task  into  independent  subtasks. 

Communication multiplexing. Efficient use of communication
facilities is obtained by multiplexing a number of low data-rate
users, for example, T(typewriter; 150 b/s)'s. This may not be a
reason for a network per se but may justify a larger network,
provided that there is some reason for having one in the first
place.

Better communication. A community of users (e.g., a scientific or
engineering community) that could mutually use the same
programs and data bases and converse about these directly (i.e., not
by writing about them but in the context of mutual use) might
become a much more productive community, with less duplication
of work, faster communication of results, etc.

Better  load  distribution  through  preprocessing.  Some  tasks  require 
very  high-data-rate  communication  with  a  computer.  By  doing 
preprocessing  in  a  smaller  computer,  a  reduced  information  rate 
can  be  sent  to  the  more  general  system. 

With  this  general  view  of  networks,  let  us  consider  several 
examples. 




IBM  ASP  (Attached  Support  Processor) 

This first example (Fig. 2) is the simplest of all computer networks,
consisting of two computers tied together, with each functionally
specialized (and in addition required to be physically close). The
function of C.support is job setup and breakdown, that is,
preprocessing and postprocessing. All T's for the network are handled
by it (except for T.console on C.main). The function of C.main
is to process data. Thus this is an escalated version of the Pc-n
Pio organization, where the Pio's have been made into a C.support
and thus can take on additional functions. It should be compared
with the CDC 6600 organization, which is C.main-10 Cio, but
where the Cio's are rather small Cio(4096 w; 12 b/w) compared
with the C.support. The ASP organization is the 360 analog of
a system consisting of an IBM 7090-IBM 7040 which emerged
spontaneously in the early sixties at several IBM installations in
order to deal with 7090 I/O bottlenecks. Thus this kind of simple
computer network has been with us for some time.

In more detail, the advantages that are claimed for ASP are
in reducing resource interference:¹

¹Adapted from IBM System/360 Attached Support Processor (ASP) System
Description, H20-0223-0.


C('Main) := Mp((.25 ~ 1) megabyte); Pc('IBM System/360 Model 65, 75);
Pio ...; Ms(disk) ...; Ms(magnetic tape) ...; Ms(drum) ...; T

C('Support) := Mp((.1 ~ .5) megabyte); Pc('IBM System/360); Pio ...;
Ms(disk) ...; Ms(magnetic tape) ...; T(card) ...; T(line; printer) ...;
T(typewriter)

Fig. 2. IBM System/360 Attached Support Processor system/ASP
PMS diagram.


1  The addition of smaller modules of Mp in the form of a
second processor. The processing of the application is
divided between the main processor and the support
processor, with each performing those functions for which it is
best suited. The core requirements for the support processor
are small in comparison with those for the main processor.
With this division of responsibilities, the system can expand
its capabilities with a minimum addition of storage.

2  The elimination of concurrent use of Pc time on the main
processor for processing support functions (such as printing).
Because the clerical functions are assigned to the support
processor, the main processor no longer shares Pc time
between the support functions and the application
programs. Therefore, the application has the opportunity to
use all the resources of the main processor to full capacity.

3  The  addition  of  selector  channels.  The  channel  capacity  of 
the  system  has  been  increased  by  one  or  more  additional 
selector  channels  attached  to  the  support  processor. 

4  An algorithm for efficient management of the direct-access
storage devices for system input/output data sets. The
algorithm was designed specifically to accommodate the
data demands, the data set characteristics, and the available
private devices. The input/output routines always know the
position of the access mechanism, thereby ensuring
minimum seek time when data are transferred to the devices.

IBM cites the above reasons for using the ASP system. These
views differ from ours on its usefulness. Ideally, a
multiprogrammed single-processor or multiprocessor structure would easily
provide all the above advantages without the overhead of having
large Mp's on two computers (both of which hold nearly the same
operating system). Also, as we note in the introduction to the
System/360 (page 584), the support-computer functions can be
handled in the main computer with very little loss of large Pc
power (3 to 10 percent). A multiprocessor structure should also
cause less overhead, by not passing data sets between two C's.
(Alternatively, in ASP this could be done by an S to common Ms
from both C's.)

University  of  Texas  network 

The structure shown in Fig. 3 is similar to ASP in that a C.main
is used, with some job setup and breakdown being done in several
other C's. However, there are several of these C's, and they provide
independent power for small tasks where the setup time for the
large system is greater than the computation time. They are also
physically remote from C.main and thus serve to make the power
of the central facility available at local sites. The Teletypes are
used to enter jobs directly to the C.main, where they are run in
a batch mode.

Components shown: T(Teletype) ...; S('Telephone Exchange);
C('CDC 6600; Computation Center); L - C('CDC 1700; Linguistic
Research Laboratory); L - C('CDC 3100; College of Business
Administration); L - C('8231 Computer Terminal) with T(card) and
T(line; printer); L(to: other C's off campus).

Fig. 3. The Computation Center, University of Texas (Austin), Network
PMS diagram.

The network of Fig. 3 is that at the University of Texas, as
derived from its internal planning memoranda. Similar systems are
in existence or under construction at other universities.

M.I.T.  proposed  network 

Figure 4 shows a network that is proposed for the M.I.T. campus
[Bhushan, Stotz, and Ward, 1967]. It moves to a more complex
switching system, partly because there are two C.main's. Here
an S(direct) is used in a non-store-and-forward mode as each C
communicates directly with another. The communication rate
between C's is 40 ~ 230 kb/s. (Note that at higher data rates a
fairly large computer is necessary just to handle the store-and-
forward message switching information rates.) The purpose of the
network is to allow users of the small or terminal C's to get
access to C('IBM 360/67) and C('GE-645). These two C's can,
of course, communicate with one another. A large number of
users are connected to T(typewriters) via the S('Telephone
Exchange).

The Lawrence Radiation Laboratory (at Livermore) network

The LRL network, started in 1964, appears to be the earliest
general-purpose-computer network. It serves a user population
of approximately 1,000, with several hundred simultaneous
on-line users. The network consists of five large computers (three
CDC 6600s and two CDC 7600s), a switching computer (a DEC


PDP-6 with two Pc's and a 262 kword Mp and a 10''-bit fixed-head
disk for fast-access files), three terminal control
computers (DEC PDP-8's), and a large central file (a 10^12-bit IBM
Photostore controlled by an IBM 1800 computer). Hardwired 4
megabit per second links connect the large computers to the
switching computer. The terminal computers and the large file
are also connected to the switching computer.

The main purpose of the network is to gain access to the
central filing, printing, and terminal facilities. Load sharing is not
an important consideration because each of the large computers
operates nearly autonomously. Thus little change was required in
each system to be integrated to the network. Jobs enter the
network in any of three ways: by the batch input terminals of a
large computer; by the typewriter inputs of a large computer;
or by the typewriter inputs of the terminal control computer
which in turn connects to the central switch. Unlike most
university computation centers, which provide service for many
users with small jobs, the LRL network is oriented to users with
(multiple) large jobs.


Fig. 4. M.I.T.-network PMS diagram (proposed).

Components shown: T(typewriter, Teletype; (10 ~ 15) char/s);
T('Dataphone) ...; T(storage CRT; display; keyboard);
C('IBM System/360 Model 67); C('IBM ASP); C('GE 645);
C('Interface Message Processor) to N('Advanced Research Projects
Agency/ARPA); C('Satellite) ... with T(CRT; display);
S('Telephone exchange; (10 ~ 15) char/s, (1.2 ~ 4.8) kb/s);
S('Wideband Communications Center; (40.8 ~ 230.4) kb/s).




Components shown: duplexed file C's, C(file), sharing a common
secondary memory for long term filing; main processors with
secondary memory (Ms), including C('Time Shared; Ms) and, e.g., a
library; high speed message concentrators, special systems, and
store and forward/sf computers, C(sf); message concentrators;
C(T(card, lines, analog, plot)); and, at the network periphery,
T(CRT; console), T(card, line, plot), T(storage CRT, display,
keyboard, console), and T(typewriter, Teletype, telephone dial).
The four switches are S(50 ~ 180 b/sec), S(600 ~ 4800 b/sec),
S(40 ~ 50 kb/sec), and S(200 ~ 2000 kb/sec; fixed), with links
L(50 ~ 180 b/s), L(600 ~ 4800 b/s), L(40 ~ 50 kb/s), and
L(200 ~ 2000 kb/s). Each S is one of: fixed, 'Telephone Exchange
(direct), C(direct), or C(store and forward/sf).

Fig. 5. Typical computer network PMS diagram.


Typical local network

We summarize in Fig. 5 the direction in which the last three
networks are moving by presenting a hypothetical, local network,
as it may mature on many large university campuses (and large
industrial establishments). The network is conceived as a single
computing facility, to serve a clientele with many heterogeneous
but partially overlapping computing needs. An essential feature
of the environment of the network is that the collection of
computing resources it connects are not planned all at once but keep
growing and changing in imperfectly controlled ways. This arises
from the quasi-independent nature of the subparts of large
universities and engineering establishments. In any event, the network
is a mixture of functionally independent and functionally
specialized C's. One probable feature is the duplexed C.files which handle
all the Ms functions for all C's, except the C(library). A library's
computer, though strongly coupled to the network, would have
its own files and specialized terminals, including hard copy devices
oriented to library needs. The C.file increases the requirements for
the S.central but provides much more economic Ms, as well as
easing the ability to connect new C's into the system, since they
immediately have access to an organized Ms.

The  reader  should  note  that  the  four  switches  (S's)  can  be  either 
fixed  links,  variable  switches  (e.g.,  Telephone  Exchange),  or  a 
computer  used  as  a  direct  switch  or  as  a  store-and-forward  switch. 

The  most  interesting  aspect  of  this  network  is  that  it  has  a 
general  hierarchical  structure  and  is  like  other  hierarchical  organi- 
zations. Here,  the  levels  of  the  organization  are  based  on  data 
rates.  For  e.xaniple,  there  is  a  very  low-level  computer  which  deals 
with  the  basic  communication  to  typewriters  at  ~  150  b/s.  This 


Chapter  40  j  Computer-network  examples  509 


C  switch  concentrates  several  typewriters  into  a  time-multiplexed 
2,400-b/s  link.  Several  of  the  2,400-b/s  links  can  in  turn  be  con- 
centrated prior  to  transmitting  via  a  50-kb/s  link.  Thus  the  general 
organizing  principle,  like  that  of  most  large  organizations,  is  to 
handle  problems  at  the  lowest  (cheapest)  possible  level.  Another 
organization  principle  of  the  hierarchy  is  that  only  relevant  infor- 
mation be  passed  between  the  levels.  For  example,  encoding  would 
be  used  so  that  only  some  fraction  of  the  bits  flowing  at  the 
periphery  would  enter  the  highest-level  computers.  At  each  of  the 
levels  we  assume  that  specialized,  time-shared  computers  are 
employed  to  handle  the  very  simpler  tasks  of  editing,  simple 
calculations,  etc. 
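
The concentration ratios implied by these data rates can be checked with a short sketch. It is only an illustration: the name `fan_in` is hypothetical, and it assumes ideal time multiplexing with no framing or control overhead, which the text does not specify.

```python
def fan_in(link_rate_bps: int, source_rate_bps: int) -> int:
    """Whole number of sources an ideally multiplexed link can carry."""
    return link_rate_bps // source_rate_bps


# Rates taken from the text: 150 b/s typewriters, 2,400-b/s and 50-kb/s links.
typewriters_per_link = fan_in(2_400, 150)
links_per_trunk = fan_in(50_000, 2_400)
typewriters_per_trunk = typewriters_per_link * links_per_trunk

print(typewriters_per_link, links_per_trunk, typewriters_per_trunk)
```

Under these assumptions each level multiplies the number of terminals served, which is why the cheapest computers sit at the periphery.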

At  the  network  periphery  there  are  a  number  of  terminal 
computers,  i.e.,  C(terminal;  CRT,  card,  lines,  analog,  plot,  key- 
board). Although  they  are  computers,  they  behave  as  terminals. 
The  DEC  338  (Chap.  25)  is  typical  of  this  terminal  class.  Part  of 
the  periphery  connects  to  other  networks  and  part  connects  to 
specialized  processes,  e.g.,  a  process  control,  or  experimental 
apparatus  on  a  dedicated  basis.  The  peripheral  computers  are  able 
to  do  local  tasks  independently  of  the  larger,  more  unreliable 
computers. 

Combat Logistics Network/ComLogNet

ComLogNet  was  developed  for  the  U.S.  Air  Force  in  the  early 
1960s  for  the  purpose  of  sending  messages  (or  information)  among 
T's  [Segal  and  Guerber,  1961].  It  is  built  to  transmit  both  at  low 


[Diagrams for Figs. 6a to 6c. Legible labels: N('ComLogNet); S('ComLogNet; variable structure; store and forward); T('Subscriber Station/SS) := T(Teletype) | T('Compound) | T('Magnetic Tape Terminal); N('Switching Center/SC) #1:5; C('Communications Data Processor/CDP); C('Tape Search Unit/TSU); C('Accumulation and Distribution Unit/ADU).]

Fig. 6a. Combat Logistics Network/ComLogNet PMS diagram.

Fig. 6b. Combat Logistics Network/ComLogNet component relationships.

Fig. 6c. S('ComLogNet) PMS diagram.




(10 char/s) and medium (1,200 to 4,800 b/s) speed, as shown in
Fig. 6a. In this regard the network is simply a message switch for
the three terminal types. It employs C's for the switching elements
and is fundamentally a store-and-forward system. Had it not been
for security, reliability, response time, and other considerations,
it would have been possible to construct an equivalent system
using standard leased-wire switches (or telephone exchanges). In Fig.


Fig.  6d.  ComLogNet  N('Switching  Center/SC)  PMS  diagram. 


6b a tree is used to present the relationship of constituent members
of ComLogNet. From it we see that at the first level ComLogNet
has just a switch, links, and terminals (as shown in Fig. 6a). The
network's switch employs five specialized N('Automatic Electronic
Switching Centers/SC)'s which communicate with each other
(Fig. 6c). Terminals connect to the individual N('SC)'s, and messages
are routed between two T's, either by a store-and-forward
process within one N('SC) or between two N('SC)'s.

The individual N('SC)'s are located at five specific locations and
consist of fixed computer configurations of five to seven C's. The
structure of N('SC) (Fig. 6d) is formed basically by a duplex C
structure which handles most processing. Attached to the two
C('Communications Data Processor/CDP) are two to four C('Accumulation
and Distribution Unit/ADU) which handle communication-link
processing. A C('Tape Search Unit) is used off line to
process data from Ms(magnetic tape). The structures of C('CDP),
C('Tape Search Unit), and C('ADU) are defined within Fig. 6d.

ARPA network¹

An  experimental  computer  network  (Fig.  7a)  is  operational  and 
connects  19  computer  facilities  associated  with  the  contractors 
of  the  Information  Technology  Branch  of  the  Advanced  Research 
Projects  Agency  (ARPA).  These  contractors,  all  of  whom  are 
engaged  in  advanced  research  in  computer  science  and  technology, 
form  a  community  in  which  to  attempt  a  general-purpose  network. 
Since  several  of  the  nodes  in  this  network  (e.g.,  M.I.T.;  see  Fig. 
4)  will  themselves  be  constructing  networks  at  their  own  sites,  the 
system  has  faced  a  good  many  of  the  design  problems  associated 
with  such  a  network.  Unlike  many  of  the  other  networks  discussed 
in  this  chapter,  the  ARPA  network  consists  of  sites  that  are  physi- 
cally remote,  that  are  each  developing  as  total  systems  under 
independent  management,  and  that  have  no  agreed-upon  func- 
tional specialization  vis-a-vis  each  other.  Furthermore,  the  uses 
that  each  node  will  make  of  other  nodes  will  be  the  fairly  general 
ones  cited  at  the  beginning  of  this  chapter,  as  generated  by  a 
general  scientific  community.  Since  many  of  the  institutions  that 
will  be  tied  in  are  major  academic  institutions,  diversity  will  be 
guaranteed.  The  motivation  behind  the  experiment  is  to  reveal 
and  begin  to  solve  the  technical  problems  of  such  general  net- 
works, while  also  discovering  which  of  the  several  advantages  of 
using  networks  listed  earlier  (or  others  unmentioned)  emerge  as 
important. 

¹The specific links, sites, etc., change with time; thus the actual structures
we present are, by the nature of the experiment, almost guaranteed to be
in error.


[Diagram for Fig. 6d. Legible labels: C('Communications Data Processor/CDP), duplexed, with Mp(#0:3), Pc, T.console, T(printer), T(paper tape; reader), Ms(drum), Ms(magnetic tape), L(to: C('External)), T(line; printer), and T('system console); C('Tape Search Unit/TSU) with Mp, T(console), Ms(#1:2; magnetic tape), and T(printer); C('Accumulation and Distribution Unit/ADU) with Mp('Data store), Mp('Procedure; function: code translation), and low-speed and high-speed communications links to C('Communications Data Processor). Numeric parameters are illegible.]




[Network diagram for Fig. 7a. Legible node labels: U California, Berkeley; Stanford Research Institute; Stanford U; Utah; Systems Development Corporation; RAND Corporation; U California, Santa Barbara; U California, Los Angeles; Carnegie-Mellon U; Bolt, Beranek, and Newman; U Illinois; Washington U, St. Louis; U Michigan; Dartmouth College; MIT Lincoln Laboratory; ARPA Headquarters, Washington, D.C.; Harvard; MIT; Bell Telephone Laboratory.]

Fig.  7a.  Advanced  Research  Projects  Agency  (ARPA)  network  PMS  diagram  (tentative). 


C('Local) := C('Host) - C('Interface Message Processor/IMP) - L(40.8 kb/s; to: N('ARPA))


Fig.  7b.  Advanced  Research  Projects  Agency  (ARPA)  local-computer 
PMS  diagram. 

Technically, the goals of the network are (1) to make a user
(T) at any site behave as though it were a T at another site and
(2) to let a C at any site use a C at another site for load, program,
and data sharing. To each site has been added a special C('Interface
Message Processor/IMP). The C('IMP) has been designed by the
creators of the network, and it provides the communality that will
permit the network to function. One constraint in the network
design is to make only small perturbations to the larger host
computers. The C('IMP) is responsible for network messages
among other nodes (i.e., to their C('IMP)'s) and for the interface
between the network and the C (or N) at the local site. The local
computer C('Host)-C('IMP) interface is shown in Figs. 7b and 7c


[Diagram: N('Local) := C(#1:(n-1); 'Host) and C(#n; 'Host), connected through C('Interface Message Processor/IMP) to L(40.8 kb/s; to: N('ARPA)).]

Fig. 7c. Advanced Research Projects Agency (ARPA) local-computer-network
PMS diagram (tentative).


[Diagram: four switching centers¹, S('Lodi, California), S('Mojave, California), S('Littleton, Massachusetts), and S('Williamstown, Kentucky), each serving a group of X's².]

¹S(manual; 50 kb/s; 'Telephone Switching Centers)
²X := (C('local) | N('local)). These N or C may communicate directly
with one another or, by using more L's, can communicate via the S's.

Fig. 7d. Advanced Research Projects Agency (ARPA) fixed switching centers PMS diagrams (tentative).




for the local-computer and local-network cases, respectively. The
C('IMP) is a C('Honeywell 516; 16 b/w; 12 ~ 16 kw; 1 µs/w) with
capability to connect to four to six links at a 50-kb/s data rate.

The ARPA network leases a set of fixed links, L(50 kb/s).
These emanate from four S.fixed, as shown in Fig. 7d. Thus the
fixed links between the various sites, as shown in Fig. 7a, are
composed of the links in Fig. 7d. For example, the
L(Carnegie-Mellon University; Bolt Beranek and Newman) goes from
Carnegie-Mellon University in Pittsburgh, Pa., to Williamstown, Ky., to
Littleton, Mass. (on one of the two links) to Bolt Beranek and
Newman in Boston, Mass. The other L(Littleton; Williamstown) is
part of L(University of Michigan; Lincoln Laboratory). With such
a fixed-link system the network must operate in a store-and-forward
fashion, with C('IMP)'s at each site carrying out this function. Thus
the C('IMP) is required at each site, since there is no uniformity
in the other C's that are at a site and no control over their
operation.

Conclusions 

We  feel  the  network  is  the  most  important  computer  structure 
in  the  book.  Through  understanding  it,  we  will  be  able  to  organize 
more  computing  power  than  with  any  other  structure  and  to 
achieve  more  reliability.  The  issues  of  switches  and  links  are  so 
vital that through understanding of them all computer structures
will  improve. 

References 

BhusA67; DaviD67; EverR57; PlugW61; RobeL67; SegaR61; IBM
System/360 Attached Support Processor (ASP) System Description,
H20-0223-0


Part  6 
Computer  families 

The  three  groups  or  families  of  computers  described  in  this  part  are  each  built  around 
a  single  ISP  and  PMS  structure.  The  IBM  701-7094  II  sequence  (Sec.  1)  shows  the 
evolution  of  a  series.  The  reader  can  trace  a  number  of  incremental  changes,  or 
features,  such  as  the  addition  of  index  registers,  indirect  addressing,  I/O  processors, 
and  larger  random-access  memories.  The  SDS  900-9000  series  and  the  IBM  Sys- 
tem/360 are  both  families  in  which  successor  models  are  within  a  planned  frame- 
work; evolution  occurs  mainly  in  the  implementations,  not  in  the  ISP. 




Section  1 


The  IBM  701-7094  II  sequence, 
a  family  by  evolution 

The IBM 701, 704, 709, 7090, 7040, 7044, 7094 I, and 7094
II sequence relationship is shown in Fig. 1. The group is not
a compatible series. The IBM 701 [Astrahan and Rochester,
1952; Buchholz, 1953] is a forerunner of the series; all except
the 701 are partially compatible. The sequence is included
because  the  7090  is  a  reference  or  benchmark  of  scientific- 
computer  power.  All  machines  use  36-bit  words.  The  701  stores 
two  instructions/word  in  the  same  manner  as  the  IAS  computer 
(Chap.  4),  whereas  all  others  in  the  sequence  store  only  one 
instruction/word.  The  701,  704,  and  709  are  first-generation, 
vacuum-tube  technology;  the  rest  are  second-generation. 

The  IBM  7094  II  description  given  in  Chap.  41  is  based 
directly  on  information  in  the  Programming  Reference  Manual, 
but  the  Appendices  of  that  chapter  give  the  ISP  of  the  Pc,  a 
Pio,  and  a  K  as  inferred  by  the  authors  of  this  book.  The 
description  of  the  Pc  gives  the  instructions  in  the  704  and  7044 


[Timeline diagram for Fig. 1: year of first delivery, 1953 to 1965, for C(701), C(704), C(709), C(7090), C(7040, 7044), C(7094 I), and C(7094 II). Source of dates: Adams Associates Computer Characteristics Quarterly. Legible notes:]

C(IBM 701; vacuum tube; 36 b/w; 2 instruction/w; similar to IAS/von Neumann C; program controlled I/O data transmission; Mp(electrostatic; 30 [unit illegible]))
C(IBM 704; vacuum tube; 36 b/w; 1 instruction/w; program controlled I/O; trap; interrupts; Mps(AC, MQ, 3 Index Registers, Instruction Counter); Mp(core; 12 µs))
C(IBM 709; 704 upward compatible; Pio controlled data transmission)
C(IBM 7090; transistor; Mp(core; 2.18 µs))
C(IBM 7094 I; upward 7090 compatible; overlapped memory; Mps(7 Index Registers); Pio('7909); Mp(2 µs))
C(IBM 7040, 7044; Pio; program controlled I/O; Mp(8, 2 µs))
C(IBM 7094 II; Mp(1.4 µs))


[Diagram for Fig. 2: Mp¹ - Pc² - K.io - S(fixed), serving
T.console;
T(line; printer; 150 line/min; 72/120 col/line);
T(card; reader; 150 card/min; 72/80 col/card);
T(card; punch; 100 card/min; 72/80 col/card);
K - S(fixed) - Ms(#0:3; drum);
K - S(fixed) - Ms(#0:3; magnetic tape; 1250 w/s; 200 w/ft; 6 char/w).
Remaining drum and tape parameters are illegible.]

¹Mp(electrostatic; random; 2048 w; 36 b/w)
²Pc(2 instructions/w; M.processor state(~3 w); 1 address/instruction;
36 b/w; technology: vacuum tubes; descendants: IBM 704, IBM 709; 1953 ~ 1956)


Fig.  1.  Relationships  among  IBM  701,  704,  709,  7094  series. 


Fig. 2. IBM 701 PMS diagram.

to  show  an  evolution.  However,  the  major  evolutionary  change 
does  not  appear  in  Pc's  ISP  but  in  the  PMS  structure. 

The  704  structure,  like  that  of  the  701  (Fig.  2),  provides  only 
for  peripheral  transfers  to  primary  memory  via  Pc  under  pro- 
grammed control  with  no  interrupt  system.  As  such,  only  one 
T  or  Ms  could  operate  easily  at  a  time.  The  709  introduced 
the  Pio('Data  Channels)  to  improve  the  ability  to  transfer  data 
between  Mp  and  Ms  without  requiring  Pc  intervention.  Concur- 
rent operation  of  several  I/O  devices  is  carried  out  by  multiple 
Pio's  along  the  lines  of  the  7094  II  PMS  structure  (Fig.  1,  Chap. 
41,  page  518).  However,  the  utilization  of  the  data  channels 
tends  to  be  rather  low,  particularly  when  the  data  channel  is 
controlling  very  slow  devices  (e.g.,  card  equipment  and  line 
printers).  When  operating  a  high-speed  tape  unit  at  90,000  x  6 
bits/sec  the  utilization  of  the  data  channel  is  still  only  approxi- 
mately 3  percent.  A  program  interrupt  method  of  data  transfers 
would  have  been  sufficient. 
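
The figure of approximately 3 percent can be reproduced with a rough calculation. The sketch below is one plausible reading, not the authors' stated method: it assumes that utilization means the fraction of Mp cycles the channel consumes, that six 6-bit characters are packed into each 36-bit word, and that each word costs one 2-microsecond memory cycle (the 7094 I value given in Chap. 41). The function name is hypothetical.

```python
def channel_utilization(char_rate: float, bits_per_char: int,
                        word_bits: int, cycle_s: float) -> float:
    """Fraction of memory cycles used when every word costs one cycle."""
    words_per_second = char_rate * bits_per_char / word_bits
    return words_per_second * cycle_s


# 90,000 x 6 bits/sec tape, 36-bit words, assumed 2-microsecond cycle.
utilization = channel_utilization(90_000, 6, 36, 2e-6)
print(f"{utilization:.1%}")
```

With these assumptions the tape delivers 15,000 words per second, so the channel is busy for only a small fraction of the available memory cycles.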

The  incompatibility  among  the  machines,  especially  the 
7090-7040-7094,  is  disheartening,  both  from  the  point  of  view 
of  a  user  and  an  engineer.  The  incremental  hardware  needed 


515 


Part  6  I  Computer  families 


Section  1     The  IBM  701-7094  II  sequence,  a  family  by  evolution 


to  achieve  compatibility  is  inexpensive  when  the  system  price 
is  considered.  Also,  the  incremental  changes  in  the  ISP  do  little 
to  increase  the  Pc  performance.  Compared  with  the  704,  the 
extensive  order  code  of  the  7094  shows  an  evolution  in  which 
for  marketing,  emotional,  or  analytic  reasons  new  instructions 
were  added.  The  index  registers  and  their  instructions  are  a 
good  example  of  this  trend.  The  7094  has  a  very  general  set 
of  Index-register  transmission  instructions;  if  implemented 
properly,  they  are  probably  easier  to  provide  than  the  original 
704  instructions. 

In  the  implementation  of  the  double-precision  floating-point 
hardware,  the  sense-indicator  register  is  needed  for  temporary 


storage.  Thus  a  user  has  to  preserve  this  register  when  double- 
precision  floating-point  instructions  are  given.  The  reason  for 
this  undoubtedly  relates  to  field  modifications  and  cost.  In  an 
original  design  this  would  be  inexcusable;  in  this  case  double- 
precision  floating  point  is  undoubtedly  worth  the  loss  of  sense 
indicators. 

All in all, the designers of the 704-7094 II provided increased
generality  through  evolution.  They  gradually  ran  out  of  patching 
time,  technology,  instruction  encoding  space,  and  memory 
addressing  bits,  while  exceeding  compatibility  constraints.  It 
was  indeed  time  to  create  the  IBM  System/360. 


Chapter  41 


The  IBM  7094  I,  II 

Introduction 

The IBM 7094 I and 7094 II computers are the last of a series
of computers beginning with the IBM 704 (Fig. 1, page 515). The
series is an outgrowth of the IBM 701. Although the series is
designed for scientific (arithmetic) calculations, its speed and
structure allow it to be used for general-purpose computation.
Business-type processing which uses string data is efficiently handled
by conversion into fixed-length fields at input and output.
From about 1956 to 1966 the family was the standard of large
computers in the United States, there being approximately 20 701,
50 704, 20 709, 50 7090, 130 7094 I, 125 7094 II, 120 7040, and
120 7044 computers in existence.

The PMS structure is a single central processor (Pc) with
multiple input/output processors (Pio's) (for all except the 701 and
704). The Pio's provide for multiple transfers to primary memory
(Mp) at high information flow rates. The structure allows for
duplex connection to terminal (T) or secondary-memory (Ms)
control (K). This provision permits the system to be used in real-time
applications requiring significant computation, high-data-rate
transfers with other systems, and high availability. However, the
system was not initially designed for time sharing and multiprogramming
use, and the attempt to so use it required modification
[Corbato et al., 1962].

The word length is 36 bits. There is one single-address instruction/word.
In all but the 7094 the processor interprets instructions
serially. In the 7094 one-register instruction look-ahead is used.
The Pc has index registers, the 704 being the first IBM computer
to use them. Their number increased from three in the 704 ~ 7090
to seven in the 7094, as their usefulness became apparent.

Structure 

A  simple  tree-structured  IBM  7094  I  using  PMS  is  shown  in  Fig. 
1  and  using  a  conventional  block  diagram  in  Fig.  2. 

Primary  memory  (Mp)  and  P-Mp  switch 

The primary memory, Mp('7302 Core Storage), has a capacity of
32,768 36-bit words with a cycle time of 2 microseconds. The
actual memory has a 72 + 1 parity bit word for even and odd
addresses of 36-bit words. A request for two 36-bit words can be
acknowledged in one 2-microsecond memory cycle. Thus Mp is
Mp('7302 Core Storage; 2 µs/w; 16,384 w; (72, 1 parity) b/w) for the
7094 I, and Mp(1.4 µs/w; 16,384 w; (72, 1 parity) b/w) for the 7094 II.
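
The relation between the 32,768 addressable 36-bit words and the 16,384 physical 72-bit words can be sketched as follows. The function names are hypothetical, and the cycle count for a pair that starts at an odd address is an assumption: the text states only that two 36-bit words can be delivered in one cycle.

```python
def locate(address: int) -> tuple[int, int]:
    """Map a 15-bit word address to (72-bit word index, half)."""
    if not 0 <= address < 32_768:
        raise ValueError("address must fit in 15 bits")
    # Half 0 is the even-address word, half 1 the odd-address word.
    return address >> 1, address & 1


def cycles_for_pair(address: int) -> int:
    """Memory cycles to read words at address and address + 1."""
    # Assumption: only an even address and its odd successor share
    # one physical 72-bit word and so arrive in a single cycle.
    return 1 if address % 2 == 0 else 2


print(locate(0), locate(32_767), cycles_for_pair(4), cycles_for_pair(5))
```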

The S('7606 Multiplexor; time multiplexed) provides access to
Mp from any one of nine P's. Only Pc can request two 36-bit words
at a time from Mp for instruction look-ahead and double-word
operations. There can be only one Pc in the system.

Processors,  P 

Three processors are described: Pc('7109, 7110 Central Processing
Unit/CPU), Pio('7607 Data Channel), and Pio('7909 Data Channel).

All P's behave similarly in that Pc instructions and Pio commands¹
are fetched (or requested) from Mp and then interpreted
in P. An instruction location counter in P addresses the next
instruction. A processor instruction may, in turn, require the
processor to access Mp for data, to perform transfers, to modify
its state, etc. Although structurally the P's are similar, organizationally
the Pc is superior to the Pio('Data Channel)'s; Pc issues
programs to Pio's and starts and stops (controls) Pio's.

Two-way  communication  is  required  between  Pc  and  the  Pio's. 
Tasks  (jobs  or  programs)  for  Pio's  are  first  set  up  in  Mp  by  Pc. 
Pc  then  demands  that  Pio  execute  the  program  independently 
under  its  own  control.  Initialization  takes  place  when  Pc  sets  the 
instruction  counter  of  a  Pio.  Upon  task  completion  in  Pio,  an 
interrupt  request  is  sent  to  Pc  from  Pio. 
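
The handshake just described can be sketched as a toy simulation. Everything named here is hypothetical: the class `Pio`, the operations "MOVE" and "END", and the use of a dictionary for Mp are illustrative stand-ins, not the 7607 or 7909 command set.

```python
class Pio:
    """Toy data channel that fetches its own instructions from Mp."""

    def __init__(self, mp: dict):
        self.mp = mp
        self.instruction_counter = None
        self.moved = []

    def run(self) -> str:
        # Runs independently of Pc until the task is complete.
        while True:
            op, operand = self.mp[self.instruction_counter]
            self.instruction_counter += 1
            if op == "MOVE":
                self.moved.append(operand)
            elif op == "END":
                return "interrupt request"  # signal sent back to Pc


# Pc first sets up the task in Mp ...
mp = {100: ("MOVE", "block A"), 101: ("MOVE", "block B"), 102: ("END", None)}
pio = Pio(mp)
# ... then initializes the Pio by setting its instruction counter.
pio.instruction_counter = 100
signal = pio.run()
print(signal, pio.moved)
```

The point of the sketch is the division of labor: Pc only writes the program and the starting address, and hears nothing further until the interrupt request.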

Below we first give a description of the Pc. Then the Pio('7909)
is presented in detail and the Pio('7607) is outlined. The reader
should compare the two Pio's. The Pio('7909) is a later design than
the Pio('7607). It interprets instructions for the block of data being
transferred and issues instructions to the KMs or KT. The earlier
Pio('7607) interprets the instructions for controlling the information
being transferred; the Pc interprets and issues the instructions
to KMs or KT. The 7909 is therefore able to control more closely
a T or Ms using a single program without need for Pc intervention.

¹IBM attempts to distinguish between Pc and Pio's terminologically by
"instruction" and "command." We make no such distinction in the following
discussion; P's interpret instructions.




[Diagram for Fig. 1: Mp¹ - S² - Pc³ and Pio's, serving the following devices:
T('7151-2; console);
Ms('729 I ~ VI; magnetic tape; 75 | 112 in/s; 200, 556, 800 by/in; 2400 ft; 6 b/by);
T('716; line; printer; 72/120 char/line; 150 ln/min; 64 symbol/char; 6 b/char);
T('721; card; punch; 72/80 col/card; 120 card/min);
T('711; card; reader; 72/80 col/card; 250 card/min; 6, 12 b/col);
Ms(#0:4; ('1301; moving head disk; 56 megachar; 156 kchar/s) | ('7320; drum; 830 kchar; 135 kchar/s); 6 b/char);
Ms(#0:9; '7340 Hypertape; addressable magnetic tape; 1800 ft; 112 in/s; (8, 2 check) b/char);
T(#0:9)⁶.]

¹Mp(core; 16384 w; (72, 1 parity) b/w; 2.0 µs/w)
²S(time multiplexed; '7606 Multiplexor; 1 M; 9 P; radial; location: central)
³Pc('7109, 7110 Central Processing Unit; 1 instruction/w; 1 address/instruction;
Mps(12 w); data: si, bv, sf, suf, df, duf, fr.i; technology: transistors; antecedents: IBM 704, 709,
7090; descendants: IBM 7040, 7044, 7094 II; 1962 ~ 1966)
⁴S(fixed; from: 2 K; to: 5 Ms; concurrency: 2)
⁵K(#1,2; '7640)
⁶T := (T('1011; paper tape; reader; 500 char/s; 6, 7, 8 b/char) | T('1014; paper tape;
punch) | T(Teletype) | T(typewriter))
⁷S(fixed; 2 K)


Fig.  1.  IBM  7094  I  PMS  diagram. 




[Block diagram for Fig. 2. Box labels: 7151-2 Console; 7302 Core Storage; 7606 Multiplexor; 7110 Instruction Processing Unit and 7109 Arithmetic Sequence Unit (Central Processing Unit); 716 Printer; 711 Card Reader; 721 Card Punch; 7607 Data Channel; 7909 Data Channel (channel switch); 729 Tape Units; 7631 File Control; 7320 Drum Storage; 1301 Disk Storage; 1414-6 I/O Synchronizer; 7640 Hypertape Control; 1009 DTU, 1011 PTU, 1014 RIU, Telegraph I/O Units; 7340 Hypertape Drives.]

Fig.  2.  IBM  7094  data-processing  system  configuration.  (Courtesy  of  International  Business  Machines  Corporation.) 


Central  processing  unit,  Pc 

The Pc has three physical parts: the T('7151-2 Console), the
D('7110 Instruction Processing Unit), and the K('7109 Arithmetic
Sequence Unit). In terms of gross PMS parameters the 7094 I can
be described, similarly to footnote 3 of Fig. 1, as

Pc('7109, 7110, 7151-2 Central Processing Unit/CPU;
36 b/w; 1 address; 1 instruction/w;
data: (si, bv, sf, suf, df, duf, fr.i);
number representation: sign, magnitude;
Mps('Accumulator, 'Multiplier_Quotient, 'Index_Registers[1:7],
'Sense_Indicators, 'Instruction_Counter, 'Trap_Enable<1:12>,
'miscellaneous_bits<1:7>))


The  Pc  will  be  discussed  in  two  parts:  the  Register-transfer 
level  implementation  and  the  Instruction-set  Processor.  These  are 
partially  redundant,  but  they  offer  another  opportunity  to  compare 
the  two  types  of  descriptions.  The  Pc  hardware  will  be  described 
by  first  giving  the  registers  and  the  interregister  transfer  paths. 
Then the process by which instructions are interpreted will be
described. (Interpretation occurs in a distinct set of memory cycles,
called instruction/I, execute/E, logic/L, and buffer/B, which are
sometimes  mentioned  in  describing  registers  and  will  be  fully 
discussed  later.) 




Processor  registers  and  mode  bits  registers 

Figure  3  gives  the  Pc  registers  and  the  data  transfer  paths.  Both 
the  ISP  registers  (denoted  by  °)  and  the  temporary  registers  are 
given.  The  ISP  registers  and  modes  are  controlled  by  the  program. 

Instruction counter (IC)°. The Instruction Counter, IC, is 15 bits.
It  is  used  by  the  processor  to  locate  the  next  instruction  in  Mp. 
Once  the  program  is  started,  the  IC  can  be  set  to  an  address 
specified  by  a  transfer  instruction.  For  most  instructions,  the  IC 
is  stepped  sequentially  by  1  with  each  new  instruction.  The  IC 
is  normally  advanced  at  the  end  of  each  instruction  (I  cycle). 

Instruction backup register (IBR). The Instruction Backup Register,
IBR, is a 36-bit register, <S, 1:35>, and is used to buffer the next
instruction.  Pc  attempts  to  have  the  next  instruction  available  in 
IBR,  since  the  Mp  permits  72-bit  transfers,  thus  avoiding  an 
unnecessary  reference  to  Mp.  When  the  instruction  reference  is 
to  an  even  location,  the  IBR  is  loaded  with  the  contents  of  the 
next  higher  odd  address  after  the  contents  of  the  even  address  have 
been  placed  in  the  Storage  Register.  The  IBR  is  also  used  for 
fetching  operands  in  double-precision  operations. 
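
The saving from the IBR can be illustrated by counting memory references for straight-line code. This is a simplified sketch under stated assumptions: no transfer instructions, no operand fetches that disturb the IBR, and a hypothetical function name.

```python
def instruction_fetch_cycles(start: int, count: int) -> int:
    """Mp references needed to fetch `count` sequential instructions."""
    cycles = 0
    ibr_holds = None  # address whose instruction is waiting in the IBR
    for address in range(start, start + count):
        if ibr_holds == address:
            ibr_holds = None  # served from the IBR, no Mp reference
            continue
        cycles += 1
        if address % 2 == 0:
            # An even-address fetch also loads the next odd word.
            ibr_holds = address + 1
    return cycles


print(instruction_fetch_cycles(0, 4), instruction_fetch_cycles(1, 4))
```

Starting on an even address, the sketch needs one reference per pair of instructions; starting on an odd address costs one extra reference.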

Address register (AR). The Address Register, AR, is 15 bits and receives
information from the Storage Register, Instruction Backup
Register (at the beginning of a storage reference I or E cycle),
Index Register, and Index Adder. The contents of the AR are
sent to the Multiplexor Address Switch to select the core memory
location.

Instruction register (IR). The 18-bit Instruction Register, IR, is
divided into two parts: bits <S, 1:9> always contain the operation
part of the instruction, and bits <10:17> form the Shift-counter
Register. The Shift Counter is used during shifting, multiplication,
division, and floating-point instructions. Bits <10:17> may also
contain a sense instruction address, operation codes for those
instructions which require an address part, and the class and unit
codes for input/output instructions.

Storage register (SR). The 36-bit Storage Register, SR, stores information
that comes from or goes to core storage.

Adders  (not  a  register).  The  Adders  furnish  a  36-bit  path  for  data 
going  from  the  storage  register  to  other  registers  in  the  processor. 

Accumulator register (AC)°. The Accumulator Register, AC, is 38
bits (a 35-bit word with a 1-bit sign, and 2 bits for overflow


conditions,  P  and  Q).  The  AC  is  used  to  hold  one  factor  during 
arithmetic  or  logical  operations  and  to  receive  results  from  the 
adders. 

Information  may  be  shifted  into  the  accumulator  from  the  MQ, 
1  bit  at  a  time. 

Multiplier-quotient register (MQ)°. The MQ Register is 36 bits.
During  a  multiply  instruction,  MQ  contains  the  multiplier;  during 
a  divide  instruction,  MQ  receives  the  quotient.  It  can  be  shifted 
right  or  left,  independently,  or  combined  with  AC  into  a  72-bit 
register. 

Sense indicator register (SI)°. The Sense Indicator Register, SI, is 36
bits.  SI  is  normally  used  as  a  set  of  binary  program  switches  which 
can  be  set  and  tested.  However,  it  is  also  used  as  a  temporary  register 
in  double-precision  arithmetic  operations. 

Index registers (XR)°. Seven 15-bit Index Registers, XRs, in the 7094
system are used for address modification. They are specified by the
tag bits of an instruction (bits <18:20>) and modify an address by
adding the two's complement of their contents to the address. In the
earlier 7090 (and 7044) only XR[1, 2, 4] are available.
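
Adding the two's complement of an index register amounts to subtracting its contents modulo 2^15. A minimal sketch, with a hypothetical function name and 15-bit arithmetic taken from the register width given above:

```python
MASK_15 = 0o77777  # 15-bit address field


def effective_address(address: int, xr: int) -> int:
    """Add the two's complement of xr to address, keeping 15 bits."""
    twos_complement = (-xr) & MASK_15
    return (address + twos_complement) & MASK_15


print(effective_address(1000, 10), effective_address(5, 10))
```

The second call shows the wraparound: an index larger than the address yields a location near the top of the 32,768-word space.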

Multiple tag mode°. In Multiple Tag Mode only Index Registers
1, 2, and 4 can be specified. The indexing function specified is
determined by the "logical-or" of each index register specified.
When not in Multiple Tag Mode, each 3-bit number selects one
of seven index registers. The 1-bit Multiple-Tag-Mode Register
maintains the state of the mode. The requirement for the two
modes comes entirely from the need to maintain compatibility
between the 704, 709, 7090, 7040, and 7044 (which have three
index registers addressed as in Multiple Tag Mode) and the 7094
I and 7094 II which have seven index registers.
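
The two interpretations of the 3-bit tag can be contrasted in a short sketch. The function name and the dictionary of register contents are hypothetical, and treating a tag of 0 as "no indexing" is an assumption not stated in this passage.

```python
def index_value(tag: int, xr: dict, multiple_tag_mode: bool) -> int:
    """Return the index quantity selected by a 3-bit tag."""
    if tag == 0:
        return 0  # assumption: tag 0 means no index register
    if multiple_tag_mode:
        # Each tag bit selects XR1, XR2, or XR4; contents are OR-ed.
        value = 0
        for bit in (1, 2, 4):
            if tag & bit:
                value |= xr[bit]
        return value
    # Otherwise the tag is a register number from 1 to 7.
    return xr[tag]


xr = {1: 0o3, 2: 0o5, 3: 0o100, 4: 0o10, 5: 0, 6: 0, 7: 0}
print(index_value(3, xr, True), index_value(3, xr, False))
```

The same tag value 3 therefore means "XR1 or-ed with XR2" on the older machines and "XR3" on the 7094, which is why the mode bit is needed.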

Tag  register  (TR).  This  temporary  register  holds  the  tag  field  of 
the  instruction  being  executed  and  is  used  to  select  the  Index 
Register  being  addressed. 

Index  adders  (XAD)  (not  a  register).  A  separate  15-position  Index 
Adder  is  used  for  the  Index-register  operations.  All  storing,  load- 
ing, changing,  and  modifying  of  Index  Registers  is  via  the  Index 
Adders. 

Accumulator overflow*. The Accumulator Overflow Indicator is
turned  on  whenever  a  1  passes  into  or  through  position  P  from 
position  1  of  the  AC  as  a  result  of  the  execution  of  a  fixed-point 
arithmetic  or  a  shifting  instruction. 




[Figure 3: block diagram showing the console (sense lights, sense
switches, operator panel keys); the CPU (Instruction Register, Shift
Counter, operation and sense decoders, Index Registers, Index Adders,
Storage Register, Instruction Counter, Address Register, Adders, Tag
Register, Instruction Backup Register, Sense Indicators); the
Multiplexor (Multiplexor Storage Bus, Multiplexor Address Switch); and
Core Storage (even and odd core addresses).]

*Available to the Instruction Set Processor

Fig. 3. IBM 7094 central-processing-unit information flow. (Courtesy of International Business Machines Corporation.)




Divide-check*. The Divide-Check Indicator is turned on, in fixed-
point  or  floating-point  division,  if  the  magnitude  of  the  number 
in  the  AC  (dividend)  is  greater  than  or  equal  to  the  magnitude 
of  the  number  in  memory  (divisor). 
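
A minimal sketch of the fixed-point case shows why the test compares AC with the divisor. It assumes magnitudes only (signs are ignored) and a double-length dividend formed from AC and MQ with 35 magnitude bits each; `fixed_divide` is a hypothetical name. If the AC magnitude is at least the divisor magnitude, the quotient would not fit in 35 bits, so the indicator is set instead of dividing.

```python
MAGNITUDE_BITS = 35


def fixed_divide(ac: int, mq: int, divisor: int):
    """Return (quotient, remainder, divide_check) for magnitudes only."""
    if ac >= divisor:                    # also catches a zero divisor
        return None, None, True          # Divide-Check Indicator turned on
    dividend = (ac << MAGNITUDE_BITS) | mq
    quotient, remainder = divmod(dividend, divisor)
    return quotient, remainder, False
```

When the check passes, the quotient is always below 2^35, because the dividend is smaller than the divisor times 2^35.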

Input-output check*. The Input-Output Check Indicator (I-O
check) is turned on by the attempted execution of an input/output
instruction without first selecting an input/output unit.

Transfer trap mode*. The computer can be operated in a special
Transfer Trap Mode. Operation in the Trap Mode permits the
program to run at normal speed with interruptions of normal
operation only at transfer points. At such points the location of
the last sequential instruction is saved, and a transfer of control
is made to a fixed location.

Sense switches*. Six Sense Switches are located on the console.
They may be turned on or off manually, and there are instructions
which sense them.

Sense lights*. Four Sense Lights are also on the console. Any one
of these lights may be turned on, off, or the status tested by
instructions.

Panel in-out switches*. These 36 switches on the console may be
read by an instruction.

Instruction-set interpretation

The basic computer clock cycle is 2.0 μs in 7094 I and 1.4 μs in
7094 II, as dictated by Mp. Within the single 2- (or 1.4-) microsecond
cycle, up to 10 sequential register transfers and/or data
operations can take place, each of which transfers information
among the Pc's registers; several operations may occur simultaneously.
In Pc four different cycles are used: instruction/I, execute/E,
logic/L, and buffer/B. The cyclic sequence of an instruction
is fixed, always beginning with an I cycle and progressing to E,
L, or B cycles, depending on the instruction. The number of cycles
required for an instruction may vary from 1 (e.g., transfer) to 19 (e.g.,
double-precision floating-point divide).

Instruction cycle (I). The I cycle begins when IC furnishes the
instruction location to Mp, via S('Multiplexor). The addressed
instruction word taken from Mp goes to the Multiplexor Storage
Bus (Fig. 3). From the Multiplexor Storage Bus the instruction
is read into the Storage Register where it is separated into the
operation portion and the address portion of the instruction word.




The operation portion of the Storage Register goes into the
Instruction Register, where the operation code is decoded and the
execute control circuitry is set up to perform the operation
specified by the instruction. The address portion of the instruction
word, now located in the Storage Register, may be used
directly. Normally, however, it goes to the Address Register and
then to the Multiplexor Address Switch to locate the appropriate
data word in Mp. If the address is to be modified, it is routed
from the Storage Register to the Index Adders for Index-register
modification. The modified address is then brought to the Address
Register and on to the Multiplexor Address Switch to locate the
data word in core storage.

Concurrently, during the same instruction cycle, a second
instruction, located at the immediately higher odd-numbered Mp
address location, is brought to the Instruction Backup Register/
IBR. While in the IBR, the odd-numbered instruction is partially
decoded to determine if it meets certain criteria for concurrent
execution, thus saving a second Mp reference. If the instruction
in the IBR cannot be executed with the current instruction, it is
ignored in the current I cycle and is brought into the Storage
Register on the next I cycle.

Execution cycle (E). The execution (E) cycle is used when a reference
to core storage is needed. All instructions requiring an operand have
an E cycle following the I cycle.

Indirect addressing of an instruction requires an extra E cycle.
In other words, an instruction that normally goes from I to E to
be executed will go to I, E, and again to E if it is indirectly
addressed.

Logic cycle (L). The L cycle is an execute cycle that does not
require a reference to Mp. Many instructions use both E and L
cycles when information is required from storage and the instruction
cannot be completed during an E cycle. Other instructions
require no reference to storage and, therefore, use only I and L
cycles for their completion.
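
The I, E, and L rules above can be summarized in a small sketch. It models only the cycle ordering stated in the text; the function names and the `logic_cycles` parameter are hypothetical, and the real number of L cycles for any given instruction is not given here.

```python
def cycle_sequence(needs_operand: bool, indirect: bool = False,
                   logic_cycles: int = 0) -> list:
    """List the Pc cycles one instruction passes through."""
    cycles = ["I"]                       # every instruction starts with I
    if needs_operand:
        if indirect:
            cycles.append("E")           # extra E cycle for the indirect word
        cycles.append("E")               # E cycle that references the operand
    cycles.extend(["L"] * logic_cycles)  # cycles with no Mp reference
    return cycles


def instruction_time_us(cycles: list, cycle_us: float = 2.0) -> float:
    """Elapsed time, using 2.0 us (7094 I) or 1.4 us (7094 II) per cycle."""
    return len(cycles) * cycle_us
```

Under this model a one-cycle transfer takes 2.0 μs on the 7094 I, and a 19-cycle instruction takes 38 μs.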

Buffer cycle (B). A buffer (B) cycle is a null Pc cycle; it is used
when the data channels get information from or put information
into core storage. This information can be either data or data-
channel commands. All demands for B cycles come from the
channels themselves. Because of the nature of Ms's and T's, the
demand for a B cycle takes precedence over an instruction being
performed by Pc. If Pc is in its logic cycle, then both an L and
B cycle occur simultaneously.




Instruction interpretation. Instruction flow diagrams for the CLA,
CAL, and CLS instructions are given in Fig. 4. These diagrams
show the sequential process of instruction execution. Although the
flow diagrams for these instructions are trivial, the general process
is still apparent. The more complex instructions, for example,
double-precision floating-point divide, are carried out in a similar
fashion, but with many more operations. The registers, transfer
paths, and interregister data operations are the register-transfer-level
primitives from which the ISP is implemented. The data
flow diagram (Fig. 3) explicitly defines the main registers and
register operations within Pc.

Pc  ISP 

The  Pc  Instruction-set  Processor  is  given  in  Appendix  1  of  this 
chapter.  The  instructions  are  arranged  in  groups  according  to  the 
location  of  operands.  These  groups  are: 

Operations on Mp

    Mp ← u Mp        (unary operation/u on Mp)
    Mp ← u Mps       (unary operation on processor state/Mps)
    Mp ← Mp b Mps    (binary operation/b)

Operations on AC and MQ

    Mps ← u Mps
    Mps ← u Mp
    Mps ← Mps b Mp

Operations on the index registers

Operations on the sense indicators

Instructions for program control

Memory mapping for multiprogramming and Mp(65536 w)

A special option provides multiprogramming by allowing a program
to run in a protected area of Mp. Two registers are used:
The base register establishes the lower bound of the program, and
the length register establishes the upper bound. Pc checks that
all program references are within the protected area.
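
The bounds check can be pictured with the sketch below. It assumes the upper bound is the base plus the length and that the bound is exclusive; the text does not state the register widths or granularity, and `reference_allowed` is a hypothetical name.

```python
def reference_allowed(address: int, base: int, length: int) -> bool:
    """True when the reference falls inside the protected area.

    Assumption: the area is base <= address < base + length.
    """
    return base <= address < base + length
```

A program given base 4096 and length 1024 could then reference addresses 4096 through 5119 and nothing else.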

Two Mp(32768 w)'s can be used on the computer. Mp is then
considered as A core and B core for addresses 0:32767 and
32768:65535. A 1-bit register is used to select whether A or B core
is to be used for data, and a second 1-bit register is used to select
whether A or B core is to be used for the instruction. These
modifications were used at M.I.T. in their Compatible Time Sharing
System/CTSS [Corbato et al., 1962] which used a 7094 II.
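
A short sketch of the A core and B core selection follows. It assumes each select bit simply supplies a sixteenth, high-order address bit; the names `physical_address`, `instruction_in_b`, and `data_in_b` are illustrative and do not come from the text.

```python
CORE_WORDS = 32768  # one Mp(32768 w)


def physical_address(address15: int, is_instruction_fetch: bool,
                     instruction_in_b: bool, data_in_b: bool) -> int:
    """Map a 15-bit program address onto the 65536-word Mp."""
    if not 0 <= address15 < CORE_WORDS:
        raise ValueError("address must fit in 15 bits")
    use_b = instruction_in_b if is_instruction_fetch else data_in_b
    return address15 + (CORE_WORDS if use_b else 0)
```

With instructions in A core and data in B core, an instruction fetch at 100 reads word 100, while a data reference to 100 reads word 32868.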


[Figure 4: flowchart. Boxes recovered: obtain instruction from storage;
operation code placed in instruction register; address routed through
address register 3-17; operation decoded in decoders; address of data
is located; bring up execution control lines; data routed to the
storage register (CLA, CAL); minus to storage register sign (CLS); SR
sign to adder P (CAL); data routed through adders to accumulator.]

Fig. 4. IBM 7094 CLA and CLS instruction flowcharts. (Courtesy of
International Business Machines Corporation.)


Pio('7607 Data Channel)

The Pio('7607 Data Channel) executes programs which transfer
data between Mp and Ms(magnetic tape) or T((card; reader,
punch), (line; printer)). The paths and structure can be seen in
Fig. 1.

Transferring blocks of data between Mp and an Ms or a T via
the 7607 data channel takes place as follows:

1 Pc sets up the block transfer program in Mp for Pio.

2 Pc attaches a K for Ms(magnetic tape) or for T(card; reader)
to Pio. (Faults in the connection may cause K to interrupt Pc.)

3 Pc starts the Pio by loading the Pio's instruction counter.

4 The data transmission takes place. On input, for example,
T or Ms transmits a 6-bit character (or a 72-bit word) to
K. The characters are buffered (collected) in K and sent on
to Pio. Pio then requests a memory access from Mp via the
S('7606 Multiplexor) and, finally, a data word is transmitted
to Mp.

5 At the termination of a simple data block transfer, Pio
fetches the next instruction from Mp. If the next instruction-task
type is the same, Pio and K remain logically linked
and continue to transmit data.




6    At  the  termination  of  the  task,  the  completion  signal  from 
Pio  causes  Pc  to  interrupt  and  Pio  may  also  halt. 
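
Step 4 collects 6-bit characters into 36-bit words before each memory access. The sketch below illustrates that packing. It assumes the first character received occupies the high-order end of the word; `assemble_words` is a hypothetical helper, and characters that do not fill a whole word are returned separately rather than padded.

```python
CHAR_BITS = 6
CHARS_PER_WORD = 6  # 6 characters x 6 bits = one 36-bit word


def assemble_words(chars):
    """Pack 6-bit characters into 36-bit words.

    Returns (words, leftover), where leftover holds any trailing
    characters that did not complete a word.
    """
    words = []
    pending = []
    for ch in chars:
        if not 0 <= ch < (1 << CHAR_BITS):
            raise ValueError("character does not fit in 6 bits")
        pending.append(ch)
        if len(pending) == CHARS_PER_WORD:
            word = 0
            for c in pending:
                word = (word << CHAR_BITS) | c  # first character ends up leftmost
            words.append(word)
            pending = []
    return words, pending
```

Six characters 1 through 6 assemble into the single word 010203040506 (octal).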

Pio('IBM 7909 Data Channel)

Ms('1301 Disk Storage, '7340 Hypertape Drives) and the T('Tele-Processing
equipment) communicate with Mp via the Pio('7909
Data Channel). Four 7909 Data Channels may be attached to a
7094 I or II system.

K('7631 File Control) is required for M(disks). Several K('7631)
can be used with the 7094 system alone or shared with an IBM
1410 system or shared with another IBM 7000 series (not 7072
system).

When Ms('7340 Hypertape Drives) are attached to the 7094
system, K('7640 Hypertape Control) is used between the 7909 data
channel and the drives. One K('7640) may be attached to a 7094
system; it has two paths, each of which can be used for data
transmission.

The K('1416-6 Input-Output Synchronizer) is used with T('Tele-processing
Equipment)'s. The structure for these T's is rather
elaborate, yet only six T's can be active at a time.

Transferring data from Mp to a T or an Ms via the 7909 takes
place as follows:

1 Pc sets up the data-transfer management program in Mp for
a Pio.

2 Pc starts Pio by setting Pio's command (instruction) location
counter at the origin of the task program in Mp. (Faults in
the connection may cause Pio interrupts to Pc.)

3 Pio issues an instruction to be executed by K. This establishes
a state in K which selects and initializes the particular Ms or
T and attaches the peripheral device K to Pio. (Faults in this
selection may cause interruption of Pio.)

4 The data-transmission instruction is read and initializes Pio.

5 The data transmission takes place under control of Pio-K.
The K of the selected device assembles characters. Input
characters are transferred to Pio which assembles them into
words and in turn transfers them to Mp.

6 At the termination of a data block transfer instruction,
another instruction is fetched from Mp by Pio. This instruction
may be to another K.

7 At the termination of the Pio program, Pio signals completion
by interrupting Pc.

This discussion is based on information taken from the IBM
7094 Reference Manual. The body of the description is contained
in ISP descriptions (Appendices 2, 3 and 4 of this chapter). The
main registers of Pio are shown in Fig. 5. These registers are
declared and their function is explained in the first section of the
ISP description of Pio (Appendix 2). The remainder of the ISP
description is concerned with defining the interpreter and the ISP
instruction set.

[Figure 5: register diagram. Mp(core) connects through S('7606
Multiplexor) and the storage bus switches (36 data) and channel
address switches (15 address) to the Pio registers: Data register/
DR<S,1:35>, Assembly register/AR<S,1:35>, Operation register/OR<0:4>,
Word counter/WC<3:17>, Control counter/CFC<0:5>, Address counter/
AC<21:35>, Command counter/CC<21:35>, Assembly ring/ASR, Character
ring/CR, and the Pio status bits. A 6-bit path leads to K(disk |
'Hypertape) with its control data register/K_data<0:5> and K status
bits, and on to Ms(disk | Hypertape).]

Fig. 5. IBM 7909 data-channel-registers diagram.

There are about 50 bits in the K's (see Appendix 3). A knowledge
of K's state and the K process is required for understanding
the Pio. A description of the K and Pio data-transmission processes
is given in Appendix 2.

The Pc instructions controlling Pio are presented in Appendix 4.

The level of detail in the appendices is slightly greater than
that in normal ISP description. It is, however, not completely
precise, as the behavior is extremely time- and Ms- or T-dependent.
The sequence check conditions are incomplete: that is, the
conditions for illegal instruction sequences are not given. Both ISP
and text descriptions are given for parts which are particularly
complex.

The ISP description should be observed in the following sequence:
Pio State; K State (Appendix 3); Pio Instruction Format;
Pio Interpreter; Pio Instructions: Control (or Initialization)
instructions, Block Transfer (or Copy) instructions, Conventional
Move and Transfer instructions, and Interrupt Control instructions;
Instructions in Pc (Appendix 4); Interrupt Operation; and Processes
defining data movements between K and Pio (Appendix 2). The
Pio, K, and Ms or T processes are, in several ways, more complex
than those of a Pc. First, Ms or T activity is not categorized as
nicely as a Pc instruction set. Second, the T or Ms events occur at
times peculiar to the device, not on a simple synchronous clock.
Finally, the peripheral components have a large number of error states.

Conclusions 

The series ending with the IBM 7094 II is a significant member of
the computer population. It provides a good example of the evolution
in computer systems that occurred from 1954 to 1965.

References 

CorbF62; FrizC53; GreeJ57; GrumM58; RossH53; SaxoJ63; StevL52;
A22-6703 IBM 7094 Principles of Operation




APPENDIX 1    IBM 7094 Pc ISP DESCRIPTION

Pc State

The description does not include the two protection and relocation schemes used for the 7040 and 7094. The Trap_Mode flip-flop
is declared; its action is not described. Trap_Mode allows any change of the Instruction Counter to cause a trap. The Instruction
Backup Register is not described, although it is used to save time in program execution. The description of the arithmetic
functions is highly simplified.

AC<Q,P,S,1:35>                        * Accumulator, 38 bits
ACs<S,1:35> := AC<S,1:35>             * signed AC word
ACl<P,1:35> := AC<P,1:35>             * logical AC word
P := AC<P>                            * carry for AC<1:35>; AC overflow is also set
Q := AC<Q>                            * carry for bits<P,1:35>
S := AC<S>                            * sign bit of AC
MQ<S,1:35>                            * Multiplier-Quotient
ACMQ<S,Q,P,1:70> := AC □ MQ<1:35>     * double word accumulator
SI<0:35>                              Sense Indicators or program flags must be preserved if
XR'[1:7]<3:17>                        Index Registers in 7094
XR"[A,B,C]<3:17> := XR'[1,2,4]<3:17>  * Index Registers for 704, 7090
Multiple_Tag_Mode                     program switch to force compatibility with 704, 7090; only
                                      3 index registers XR[A,B,C] are in 704, 7090
IC<3:17>                              * Instruction Location Counter
Run                                   * indicates whether machine is executing instructions
Divide_Check
AC_Overflow
MQ_Overflow
Input_Output_Check                    *
Trap_request<A:H>                     Request to trap Pc from Pio #A...#H
Trap_Mode                             * Allows trapping or not of transfer instructions (not
                                      described)

Pc Console State

Keys<0:35>                            * console data
Sense_Switches<0:5>
Sense_Lights<0:3>

Mp State

M[0:32768-1]<S,1:35>                  Primary Memory of 2^15 w

Instruction Format

instruction<S,1:35>                   corresponds to the physical Storage Register
Y<21:35> := instruction<21:35>        generally the address part; used to calculate the effective
                                      address; corresponds to the physical Address Register
T<18:20> := instruction<18:20>        the XR to use: 1,...,7; 0 means no indexing; corresponds to
                                      a physical register
F<12:13> := instruction<12:13>        indirect address specification
indirect := (F<12:13> = 11)
op<S,1:11> := instruction<S,1:11>     op code; corresponds to a physical register
hi_op<S,1:2> := instruction<S,1,2>    special op codes

* Denotes subset ISP, IBM 704, 7044 series



R<18:35> := instruction<18:35>        right half of instruction used to select SI bits
D<3:17> := instruction<3:17>          Decrement part of instruction, used to directly modify XR's
C'<12:17> := instruction<12:17>       specifies variable length part of operation
C<10:17> := instruction<10:17>        convert instruction parameter
c<15:17> := instruction<15:17>        specifies character position in 7040, 7044 or extends op code
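
The field declarations above use bit positions S, 1, ..., 35, with S leftmost. A minimal sketch of extracting those fields from a 36-bit word follows; it treats S as position 0, and the helper names `field` and `decode` are illustrative only.

```python
WORD_BITS = 36


def field(word: int, first: int, last: int) -> int:
    """Extract bits <first:last> of a 36-bit word (S is position 0)."""
    width = last - first + 1
    shift = (WORD_BITS - 1) - last
    return (word >> shift) & ((1 << width) - 1)


def decode(word: int) -> dict:
    """Split an instruction word into the fields declared in the ISP."""
    return {
        "op": field(word, 0, 11),      # op<S,1:11>, sign bit included
        "F": field(word, 12, 13),      # indirect flag bits
        "T": field(word, 18, 20),      # tag
        "Y": field(word, 21, 35),      # address part
        "D": field(word, 3, 17),       # decrement part
        "indirect": field(word, 12, 13) == 0b11,
    }
```

For example, a word built as `(0o500 << 24) | (2 << 15) | 0o1234` decodes to op 500 (octal), tag 2, and address 1234 (octal), with no indirect addressing.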

Effective Address Calculation Process

e<21:35> := (¬indirect → e';                          effective address calculation
    indirect → (
        instruction<18:35> ← M[e']<18:35>; next e'))  1 level indirect addressing

e'<21:35> := ((T = 0) → Y;                            indexed effective address
    (T ≠ 0) → Y - XR[T])

e"<23:35> := e<23:35>

sc<28:35> := e<28:35>                                 a truncation of e, used for specifying number of shifts;
                                                      corresponds to a physical register

XR[T]<3:17> := (                                      index registers are or'd together in multiple tag mode
    ¬Multiple_Tag_Mode → XR'[T];
    Multiple_Tag_Mode → (
        (T<18> → XR"[A]) ∨ (T<19> → XR"[B]) ∨ (T<20> → XR"[C])))

The description for Multiple Tag Mode is incomplete for the case of writing in several index registers at one time. The only
way this could be accomplished in the description would be to define each load index register instruction as microprogrammed.

Data Formats

sl<S,1:35>                            logical data; unsigned integer/boolean vector
sx<S,1:35>                            single precision fixed point (integer) data
sx_sign := sx<S>
sx_magnitude<1:35> := sx<1:35>
sf<S,1:35>                            single precision floating point value of: sf_sign □ sf_mantissa
                                      × 2^sf_exponent
sf_sign := sf<S>
sf_exponent<1:8> := 200₈ - sf<1:8>
sf_mantissa<0:26> := sf<9:35>
df[0:1]<S,1:35>                       double precision floating point value of: df_sign □ df_mantissa
                                      × 2^df_exponent
df_sign := df[0]<S>
df_exponent<1:8> := 200₈ - df[0]<1:8>
df_mantissa<0:53> := df[0:1]<9:35>

Instruction Interpretation Process

Run → (instruction ← M[IC]; IC ← IC + 1; next         fetch
    instruction_execution)                            execute
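
The fetch and execute steps above can be sketched as a loop. This is a deliberately simplified illustration: instructions are written as hypothetical `(mnemonic, address)` tuples instead of 36-bit words, values are ordinary Python integers rather than sign-magnitude words, and `HTR` stands in for any instruction that clears Run.

```python
def execute(instruction, state, memory):
    """Carry out one instruction (a small illustrative subset)."""
    mnemonic, address = instruction
    if mnemonic == "CLA":                 # AC <- 0; next AC <- AC + M[e]
        state["AC"] = memory[address]
    elif mnemonic == "ADD":               # AC <- AC + M[e]
        state["AC"] += memory[address]
    elif mnemonic == "STO":               # M[e] <- AC
        memory[address] = state["AC"]
    elif mnemonic == "HTR":               # stop: Run <- 0
        state["Run"] = False
    else:
        raise ValueError("unknown mnemonic: " + mnemonic)


def run(memory, ic=0, max_steps=1000):
    """Repeat fetch, IC <- IC + 1, execute while Run is set."""
    state = {"AC": 0, "IC": ic, "Run": True}
    steps = 0
    while state["Run"] and steps < max_steps:
        instruction = memory[state["IC"]]    # instruction <- M[IC]
        state["IC"] += 1                     # IC <- IC + 1
        execute(instruction, state, memory)  # next instruction_execution
        steps += 1
    return state
```

The `max_steps` guard is an addition for safety in the sketch; the ISP itself simply runs while Run is set.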

Instruction Set and Instruction Execution Process

Instruction_execution := (

Operations on M
M[e] ← f; or M[e] ← f(M[e]);

STZ (:= op = 600) → M[e] ← 0;                                   * store zero
MSP (:= (op = -1623) ∧ (c = 7)) → (M[e]<S> ← 0);                make sign positive; 704 series only
MSM (:= (op = -1623) ∧ (c = 6)) → (M[e]<S> ← 1);                make sign minus; 704 series only

Block transfer of data, M ← M (704 series only)

TMT (:= op = -1704) → (M[AC<21:35>:(AC<21:35> + e'<28:35>)] ←
    M[AC<3:17>:(AC<3:17> + e'<28:35>)]);



Single word data transmission to M, M[e] ← Register

STQ (:= op = -600) → (M[e] ← MQ);                               * store MQ
SLQ (:= op = -620) → (M[e]<S,1:17> ← MQ<S,1:17>);               * store left half MQ
STO (:= op = 601) → (M[e] ← ACs);                               * store
SLW (:= op = 602) → (M[e] ← ACl);                               * store logical word
STP (:= op = 630) → (M[e]<S,1,2> ← AC<P,1,2>);                  * store prefix
STD (:= op = 622) → (M[e]<3:17> ← AC<3:17>);                    * store decrement
STT (:= op = 625) → (M[e]<18:20> ← AC<18:20>);                  * store tag
STA (:= op = 621) → (M[e]<21:35> ← AC<21:35>);                  * store address
STL (:= op = -625) → (M[e]<21:35> ← IC);                        store instruction location counter
STR (:= hi_op = -1) → (M[0]<21:35> ← IC; IC ← 2);               store instruction location counter and trap
STI (:= op = 604) → (M[e] ← SI);                                store indicators
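
Several of the stores above replace only one field of the memory word and leave the rest unchanged. A minimal sketch of that partial-word store follows; it treats both operands as 36-bit patterns with S at position 0, and the names `store_field`, `std`, `stt`, and `sta` are illustrative wrappers rather than anything defined in the ISP.

```python
WORD_BITS = 36


def store_field(dest: int, src: int, first: int, last: int) -> int:
    """Return dest with bits <first:last> replaced by the same bits of src."""
    width = last - first + 1
    shift = (WORD_BITS - 1) - last
    mask = ((1 << width) - 1) << shift
    return (dest & ~mask) | (src & mask)


def std(m: int, ac: int) -> int:
    return store_field(m, ac, 3, 17)     # store decrement


def stt(m: int, ac: int) -> int:
    return store_field(m, ac, 18, 20)    # store tag


def sta(m: int, ac: int) -> int:
    return store_field(m, ac, 21, 35)    # store address
```

Storing the address field of a zero accumulator into a word of all ones clears only the low 15 bits, giving 777777700000 (octal).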

Double length data transmission to M from AC, MQ

DST (:= op = -603) → (M[e]□M[e+1] ← ACs□MQ);                    double store

Binary operation with AC: M[e] ← AC b M[e];

ORS (:= op = -602) → (M[e] ← ACl ∨ M[e]);                       * or to storage
ANS (:= op = 320) → (M[e] ← ACl ∧ M[e]);                        * and to storage

6 bit character to M from AC, (7040 only);

SAC (:= op = -1623) → (M[e]<c × 6:(c × 6 + 5)> ← AC<30:35>);

Operations to the AC, MQ, or ACMQ with AC, MQ, ACMQ, Keys and M operands:

CLM (:= (op = 760) ∧ (e' = 0)) → (AC<Q,P,1:35> ← 0);            clear magnitude
SSP (:= (op = 760) ∧ (e' = 3)) → (AC<S> ← 0);                   * set sign plus
SSM (:= (op = -760) ∧ (e' = 3)) → (AC<S> ← 1);                  * set sign minus
CLA (:= op = 500) → (AC ← 0; next ACs ← AC + M[e]);             clear and add
CAL (:= op = -500) → (AC ← 0; next ACl ← ACl + M[e]);           clear and add logical
CLS (:= op = 502) → (AC ← 0; next AC ← AC - M[e]);              clear and subtract
LDQ (:= op = 560) → (MQ ← M[e]);                                load MQ
ENK (:= (op = 760) ∧ (e' = 4)) → (MQ ← Keys);                   enter Keys
PIA (:= op = -46) → (AC ← SI);                                  place indicators in AC
DLD (:= op = 443) → (ACs□MQ ← M[e]□M[e+1]);                     double load

Operations with AC, AC ← f(AC)

CHS (:= (op = 760) ∧ (e' = 2)) → (AC<S> ← ¬AC<S>);              change sign
COM (:= (op = 760) ∧ (e' = 6)) → (AC<Q,P,1:35> ← ¬AC<Q,P,1:35>);  * complement magnitude
RND (:= (op = 760) ∧ (e' = 10)) → (MQ<1> → AC ← AC + 1);        * round
FRN (:= (op = 760) ∧ (e' = 11)) → (AC ← round(ACMQ){sf});       * floating round
ALS (:= op = 767) → (AC<Q,P,1:35> ← AC<Q,P,1:35> × 2^sc);       * AC left shift
ARS (:= op = 771) → (AC<Q,P,1:35> ← AC<Q,P,1:35> / 2^sc);       * AC right shift
LLS (:= op = 763) → (ACMQ' ← ACMQ' × 2^sc);                     * long left shift
LRS (:= op = 765) → (ACMQ' ← ACMQ' / 2^sc);                     * long right shift
    ACMQ'<0:71> := AC<Q,P,1:35> □ MQ<1:35>
LGL (:= op = -763) → (ACMQ" ← ACMQ" × 2^sc {logical});          logical left shift
LGR (:= op = -765) → (ACMQ" ← ACMQ" / 2^sc {logical});          * logical right shift
    ACMQ"<0:72> := AC<Q,P,1:35> □ MQ<S,1:35>
RQL (:= op = -773) → (MQ ← MQ × 2^sc {rotate});                 * rotate MQ left
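
The long left shift treats AC<Q,P,1:35> and MQ<1:35> as one 72-bit register. The sketch below follows that declaration literally; it ignores the sign bits and the overflow indicator, which the simplified ISP also leaves out, and the function names are hypothetical.

```python
AC_BITS = 37      # Q, P, 1:35
MQ_BITS = 35      # 1:35
TOTAL_BITS = AC_BITS + MQ_BITS   # 72 bits, positions 0:71


def shift_count(e: int) -> int:
    """sc<28:35>: the low 8 bits of the effective address."""
    return e & 0o377


def long_left_shift(ac: int, mq: int, sc: int):
    """Shift the combined AC/MQ magnitude left by sc places.

    Returns the new (ac, mq) pair; bits shifted past Q are lost.
    """
    combined = (ac << MQ_BITS) | mq
    combined = (combined << sc) & ((1 << TOTAL_BITS) - 1)
    return combined >> MQ_BITS, combined & ((1 << MQ_BITS) - 1)
```

A 1 in the leftmost MQ position moves into the rightmost AC position after a shift of one place.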

Exchange of Data between registers, AC, and MQ

XCA (:= op = 131) → (AC ← MQ; MQ ← AC);                         exchange AC and MQ



XCL (:= op = -130) → (MQ ← ACl; ACl ← MQ; AC<S,Q> ← 0);         exchange logical AC and MQ

6 bit character to AC from M (704 only)

PCS (:= op = -1505) → (AC<30:35> ← M[e]<(c × 6):(c × 6 + 5)>);  place character from storage

Binary operations with M, AC ← AC b M;

ADD (:= op = 400) → (AC ← AC + M[e]);                           * add
ADM (:= op = 401) → (AC ← AC + abs(M[e]));                      * add magnitude
SUB (:= op = 402) → (AC ← AC - M[e]);                           * subtract
SBM (:= op = -400) → (AC ← AC - abs(M[e]));                     * subtract magnitude
MPY (:= op = 200) → (ACMQ ← MQ × M[e]; AC<Q,P> ← 0);            * multiply
MPR (:= op = -200) → (ACMQ ← MQ × M[e]; next
    MQ<1> → AC ← AC + 1; AC<Q,P> ← 0);                          * multiply and round
DVH (:= op = 220) → (AC,MQ ← ACMQ / M[e]; next
    Divide_check → Run ← 0);                                    * divide or halt
DVP (:= op = 221) → (AC,MQ ← ACMQ / M[e]);                      * divide or proceed; Divide_check may be set
ACL (:= op = 361) → (ACl ← ACl + M[e]);                         * add and carry logical word
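
The MPY entry leaves a double-length product in ACMQ. A minimal sketch of how a product of two 35-bit magnitudes splits across the two registers is given below. It assumes a sign-magnitude representation with sign 0 for plus and 1 for minus, and that the product sign is the exclusive-or of the operand signs; `multiply` is a hypothetical name.

```python
MAGNITUDE_BITS = 35
MAGNITUDE_MASK = (1 << MAGNITUDE_BITS) - 1


def multiply(mq_sign: int, mq_mag: int, m_sign: int, m_mag: int):
    """Return (sign, ac_magnitude, mq_magnitude) of MQ x M[e]."""
    product = mq_mag * m_mag                 # at most 70 bits
    sign = mq_sign ^ m_sign
    ac_mag = product >> MAGNITUDE_BITS       # high-order 35 bits
    new_mq = product & MAGNITUDE_MASK        # low-order 35 bits
    return sign, ac_mag, new_mq
```

Small products land entirely in MQ with a zero AC magnitude; only products of 2^35 or more spill into AC.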

The following are variable length × and / operations. C specifies the length of divisor or multiplier.

VLM (:= op = 204) → (ACMQ ← MQ × M[e] {vl});                    variable length multiply
VDP (:= op = 225) → (AC,MQ ← ACMQ / M[e] {vl});                 variable length divide or proceed
VDH (:= op = 224) → (AC,MQ ← ACMQ / M[e] {vl}; next
    Divide_check → Run ← 0);                                    variable length divide or halt

Single precision floating point

FAD (:= op = 300) → (ACMQ ← AC + M[e] {sf});                    * add
FAM (:= op = 304) → (ACMQ ← AC + abs(M[e]) {sf});               * add magnitude
FSB (:= op = 302) → (AC,MQ ← AC - M[e] {sf});
FSM (:= op = 306) → (AC,MQ ← AC - abs(M[e]) {sf});              * subtract magnitude
FMP (:= op = 260) → (AC,MQ ← MQ × M[e] {sf});                   * multiply
FDH (:= op = 240) → (AC,MQ ← AC / M[e] {sf}; next
    Divide_check → Run ← 0);                                    divide or halt
FDP (:= op = 241) → (AC,MQ ← AC / M[e] {sf});                   * divide or proceed

Unnormalized single precision floating point

UFA (:= op = -300) → (AC,MQ ← AC + M[e] {suf});                 * add
UAM (:= op = -304) → (AC,MQ ← AC + abs(M[e]) {suf});            * add magnitude
UFS (:= op = -302) → (AC,MQ ← AC - M[e] {suf});                 * subtract
USM (:= op = -306) → (AC,MQ ← AC - abs(M[e]) {suf});            * subtract magnitude
UFM (:= op = -260) → (AC,MQ ← MQ × M[e] {suf});                 * multiply

Double precision floating point

In DF operations, the SI are used as temporary registers and will be changed.

DFAD (:= op = 301) → (
    ACMQ ← ACMQ + M[e]□M[e+1] {df}; SI ← ?);                    * add
DFAM (:= op = 305) → (
    ACMQ ← ACMQ + abs(M[e]□M[e+1]) {df}; SI ← ?);               add magnitude
DFSB (:= op = 303) → (
    ACMQ ← ACMQ - M[e]□M[e+1] {df}; SI ← ?);                    * subtract
DFSM (:= op = 307) → (
    ACMQ ← ACMQ - abs(M[e]□M[e+1]) {df}; SI ← ?);               * subtract magnitude



DFMP (:= op = 261) → (
    ACMQ ← ACMQ × M[e]□M[e+1] {df}; SI ← ?);                    multiply
DFDH (:= op = -240) → (
    ACMQ ← ACMQ / M[e]□M[e+1] {df}; SI ← ?; next
    Divide_check → Run ← 0);                                    divide or halt
DFDP (:= op = -241) → (
    ACMQ ← ACMQ / M[e]□M[e+1] {df}; SI ← ?);                    divide or proceed; Divide_check may be set

Unnormalized double precision floating point

DUFA (:= op = -301) → (
    ACMQ ← ACMQ + M[e]□M[e+1] {duf}; SI ← ?);                   add
DUAM (:= op = -305) → (
    ACMQ ← ACMQ + abs(M[e]□M[e+1]) {duf}; SI ← ?);              add magnitude
DUFS (:= op = -303) → (
    ACMQ ← ACMQ - M[e]□M[e+1] {duf}; SI ← ?);                   subtract
DUSM (:= op = -307) → (
    ACMQ ← ACMQ - abs(M[e]□M[e+1]) {duf}; SI ← ?);              subtract magnitude
DUFM (:= op = -261) → (
    ACMQ ← ACMQ × M[e]□M[e+1] {duf}; SI ← ?);                   multiply

Logical

ORA (:= op = -501) → (ACl ← ACl ∨ M[e]);                        or to accumulator
ANA (:= op = -320) → (ACl ← ACl ∧ M[e]);                        and to accumulator
ERA (:= op = 322) → (ACl ← ACl ⊕ M[e]);                         exclusive or to accumulator

The convert instructions are not described in detail. These instructions take a table in memory, addressed by the six 6-bit
characters in AC or MQ and form a sum of products in the AC or MQ for each character component of the word.

CVR (:= op = 114) → (AC,MQ ← f(AC,C,XR[1],M[Y:Y+63]));          convert by replacement from the AC
CRQ (:= op = -154) → (AC,MQ ← f(MQ,C,XR[1],M[Y:Y+63]));         convert by replacement from the MQ
CAQ (:= op = -114) → (AC,MQ ← f(AC,MQ,C,XR[1],M[Y:Y+63]));      convert by addition from the MQ

Transmission between M, XR[T], and AC
If tag, T, = 0, then a no operation occurs.

PDX (:= op = -734) → (XR[T] ← AC<3:17>);                  place decrement in index
PAX (:= op = 734) → (XR[T] ← AC<21:35>);                  place address in index
PDC (:= op = -737) → (XR[T] ← 2^15 - AC<3:17>);           place complement of decrement in index
PAC (:= op = 737) → (XR[T] ← 2^15 - AC<21:35>);           place complement of address in index
LXD (:= op = -534) → (XR[T] ← M[Y]<3:17>);                load index from decrement
LXA (:= op = 534) → (XR[T] ← M[Y]<21:35>);                load index from address
LDC (:= op = -535) → (XR[T] ← 2^15 - M[Y]<3:17>);         load complement of decrement in index
LAC (:= op = 535) → (XR[T] ← 2^15 - M[Y]<21:35>);         load complement of address in index
AXT (:= op = 774) → (XR[T] ← Y);                          address to index true
AXC (:= op = -774) → (XR[T] ← 2^15 - Y);                  address to index complement
PXD (:= op = -754) → (AC ← 0; next AC<3:17> ← XR[T]);     place index in decrement
PXA (:= op = 754) → (AC ← 0; next AC<21:35> ← XR[T]);     place index in address
PCD (:= op = -756) → (AC ← 0; next AC<3:17> ← 2^15 - XR[T]);    place complement of index in decrement
PCA (:= op = 756) → (AC ← 0; next AC<21:35> ← 2^15 - XR[T]);    place complement of index in address
SXD (:= op = -634) → (M[Y]<3:17> ← XR[T]);                store index in decrement
SXA (:= op = 634) → (M[Y]<21:35> ← XR[T]);                store index in address

Chapter 41 | The IBM 7094 I, II


SCD (:= op = -636) → (M[Y]<3:17> ← 2^15 - XR[T]);         * store complement of index in decrement
SCA (:= op = 636) → (M[Y]<21:35> ← 2^15 - XR[T]);         * store complement of index in address
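
The complement forms listed above (PDC, PAC, LDC, LAC, AXC, PCD, PCA, SCD, SCA) all compute 2^15 minus a 15-bit field. The following Python sketch illustrates the field extraction and the complement for four of them. It assumes a 36-bit word numbered S = 0 through 35 from the left, and it assumes the result is reduced modulo 2^15 so that it fits a 15-bit index register; the helper names are illustrative and not part of the ISP description.

```python
XR_MODULUS = 1 << 15  # index registers hold 15 bits, so 2^15 is the modulus


def field(word, first, last):
    """Return bits <first:last> of a 36-bit word; bit 0 is S, bit 35 is low order."""
    width = last - first + 1
    return (word >> (35 - last)) & ((1 << width) - 1)


def pdx(ac):
    """PDX: XR[T] <- AC<3:17> (decrement part)."""
    return field(ac, 3, 17)


def pax(ac):
    """PAX: XR[T] <- AC<21:35> (address part)."""
    return field(ac, 21, 35)


def pdc(ac):
    """PDC: XR[T] <- 2^15 - AC<3:17>; the modulo reduction is an assumption."""
    return (XR_MODULUS - field(ac, 3, 17)) % XR_MODULUS


def pac(ac):
    """PAC: XR[T] <- 2^15 - AC<21:35>; the modulo reduction is an assumption."""
    return (XR_MODULUS - field(ac, 21, 35)) % XR_MODULUS


# Example word: decrement 12345 (octal), address 7
ac = (0o12345 << 18) | 0o00007
print(oct(pdx(ac)), pax(ac), pac(ac))
```

Storing the complement is what lets a count be loaded as a negative displacement and then stepped with the loop-control instructions.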

Transmission: Sense Indicators

PAI (:= op = 44) → (SI ← ACI);                            place accumulator in indicators
LDI (:= op = 441) → (SI ← M[e]);                          load indicators
OAI (:= op = 43) → (SI ← SI ∨ ACI);                       OR accumulator to indicators
RIA (:= op = -42) → (SI ← SI ∧ ¬ACI);                     reset indicators from accumulator
IIA (:= op = 41) → (SI ← SI ⊕ ACI);                       invert indicators from accumulator
OSI (:= op = 442) → (SI ← SI ∨ M[e]);                     OR storage to indicators
RIS (:= op = 445) → (SI ← SI ∧ ¬M[e]);                    reset indicators from storage
IIS (:= op = 440) → (SI ← SI ⊕ M[e]);                     invert indicators from storage
SIL (:= op = -55) → (SI<0:17> ← SI<0:17> ∨ R);            set indicators of left half
RIL (:= op = -57) → (SI<0:17> ← SI<0:17> ∧ ¬R);           reset indicators of left half
IIL (:= op = -51) → (SI<0:17> ← SI<0:17> ⊕ R);            invert indicators of left half
SIR (:= op = 55) → (SI<18:35> ← SI<18:35> ∨ R);           set indicators of right half
RIR (:= op = 57) → (SI<18:35> ← SI<18:35> ∧ ¬R);          reset indicators of right half
IIR (:= op = 51) → (SI<18:35> ← SI<18:35> ⊕ R);           invert indicators of right half
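
The half-register indicator instructions above apply an 18-bit operand R to either the left half (SI<0:17>) or the right half (SI<18:35>) of the sense indicators. A minimal Python sketch of the six operations follows. It assumes SI is held as a 36-bit unsigned integer with bit 0 leftmost and that R is an 18-bit value taken from the instruction; R is defined outside this section, so its width here is an assumption.

```python
MASK18 = (1 << 18) - 1
MASK36 = (1 << 36) - 1


def _left(r):
    """Align an 18-bit operand with SI<0:17>."""
    return (r & MASK18) << 18


def _right(r):
    """Align an 18-bit operand with SI<18:35>."""
    return r & MASK18


def sil(si, r):  # set indicators of left half
    return si | _left(r)


def ril(si, r):  # reset indicators of left half
    return si & ~_left(r) & MASK36


def iil(si, r):  # invert indicators of left half
    return si ^ _left(r)


def sir(si, r):  # set indicators of right half
    return si | _right(r)


def rir(si, r):  # reset indicators of right half
    return si & ~_right(r) & MASK36


def iir(si, r):  # invert indicators of right half
    return si ^ _right(r)
```

Each function leaves the other half of SI unchanged, which mirrors the way the ISP expressions name only one half on the left-hand side.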

Program flow control instructions

NOP (:= op = 761) ;                                       no operation
HPR (:= op = 420) → (Run ← 0);                            * halt and proceed
HTR (:= op = 0) → (Run ← 0; IC ← e);                      * halt and transfer
TRA (:= op = 20) → (IC ← e);                              transfer
XEC (:= op = 522) → (instruction ← M[e]; next             execute
    Instruction_execution);

Conditional transfers

TZE (:= op = 100) → ((AC<Q,P,1:35> = 0) → IC ← e);        * transfer on zero
TNZ (:= op = -100) → (¬(AC<Q,P,1:35> = 0) → IC ← e);      transfer on no zero
TPL (:= op = 120) → (¬AC<S> → IC ← e);                    * transfer on plus
TMI (:= op = -120) → (AC<S> → IC ← e);                    transfer on minus
TOV (:= op = 140) → (AC_overflow → IC ← e;                transfer on overflow
    AC_overflow ← 0);
TNO (:= op = -140) → (¬AC_overflow → IC ← e;              transfer on no overflow
    AC_overflow ← 0);
TQP (:= op = 162) → (¬MQ<S> → IC ← e);                    transfer on MQ plus
TQO (:= op = 161) → (MQ_overflow → IC ← e;                * transfer on MQ overflow
    MQ_overflow ← 0);
TLQ (:= op = 40) → ((AC > MQ) → IC ← e);                  * transfer on low MQ
TIO (:= op = 42) → ((ACI = (ACI ∧ SI)) → IC ← e);         * transfer when indicators on
TIF (:= op = 46) → ((0 = (ACI ∧ SI)) → IC ← e);           transfer when indicators off

Index manipulation and control and subroutine calling

TSX (:= op = 74) → (XR[T] ← 2^15 - IC; IC ← Y);           * transfer and set index
TSL (:= op = -1627) → (M[e]<21:35> ← IC; IC ← e + 1);     * 704

Loop control

TXI (:= hi_op = 1) → (XR[T] ← XR[T] + D; IC ← Y);         * transfer with index incremented
TXH (:= hi_op = 3) → ((D < XR[T]) → IC ← Y);              * transfer on index high
TXL (:= hi_op = -3) → ((D ≥ XR[T]) → IC ← Y);             transfer on index low or equal
TIX (:= hi_op = 2) → ((XR[T] > D) → (XR[T] ← XR[T] - D;   * transfer on index
    IC ← Y));
TNX (:= hi_op = -2) → ((XR[T] > D) → XR[T] ← XR[T] - D;   transfer on no index
    (XR[T] ≤ D) → IC ← Y);
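
The loop-control instructions above compare an index register with the decrement field D and either fall through or transfer to Y. The Python sketch below models TXI, TIX, and TNX as functions that return the new index value and the next instruction address. It assumes IC has already been advanced past the instruction when it executes and that index arithmetic wraps at 2^15; the function signatures are illustrative only.

```python
XR_MODULUS = 1 << 15


def txi(xr, d, ic, y):
    """TXI: always add D to the index and transfer to Y."""
    return (xr + d) % XR_MODULUS, y


def tix(xr, d, ic, y):
    """TIX: if XR > D, subtract D and transfer to Y; otherwise fall through."""
    if xr > d:
        return xr - d, y
    return xr, ic


def tnx(xr, d, ic, y):
    """TNX: if XR > D, subtract D and fall through; otherwise transfer to Y."""
    if xr > d:
        return xr - d, ic
    return xr, y


def count_passes(n, d=1):
    """Count executions of a loop body closed by TIX, starting with XR = n."""
    loop_top, after_loop = 0, 1
    xr, passes = n, 0
    while True:
        passes += 1                              # loop body
        xr, ic = tix(xr, d, after_loop, loop_top)
        if ic != loop_top:                       # TIX fell through: loop ends
            return passes


print(count_passes(5))
```

With D = 1 and an initial index of n, the body runs n times, because TIX stops transferring once the index is no longer greater than D.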

Skip tests

MIT (:= (op = -1341) ∧ (c = 7)) → (M[e]<S> → IC ← IC + 1);       storage minus test; 704 series only
PLT (:= (op = -1341) ∧ (c = 6)) → (¬M[e]<S> → IC ← IC + 1);      storage plus test; 704 series only
CCS (:= (op = -1341) ∧ (c < 6)) → (                              compare character with storage; 704 series only
    (AC<30:35> = M[e]<(c × 6):(c × 6 + 5)>) → IC ← IC + 1;
    (AC<30:35> < M[e]<(c × 6):(c × 6 + 5)>) → IC ← IC + 2);
PBT (:= (op = -760) ∧ (e' = 1)) → (AC<P> → IC ← IC + 1);         * P bit test
DCT (:= (op = +760) ∧ (e' = 12)) → (Divide_check → IC ← IC + 1); * Divide_check test
LBT (:= (op = +760) ∧ (e' = 1)) → (AC<35> → IC ← IC + 1);        * low bit test
ZET (:= op = +520) → ((M[e] = 0) → IC ← IC + 1);                 * storage zero test
NZT (:= op = -520) → ((M[e] ≠ 0) → IC ← IC + 1);                 * storage non zero test
CAS (:= op = +340) → (                                           * compare AC with storage
    (AC = M[e]) → IC ← IC + 1;
    (AC < M[e]) → IC ← IC + 2);
LAS (:= op = -340) → (                                           * logical compare AC with storage
    (AC<Q,P,1:35> = M[e]<S,1:35>) → (IC ← IC + 1);
    (AC<Q,P,1:35> < M[e]<S,1:35>) → (IC ← IC + 2));
SWT (:= (op = 760) ∧ (e'<9:14> = 16)) → (                        Sense_Switches test
    Sense_Switches<e'<15:17>> → IC ← IC + 1);
SLF (:= (op = 760) ∧ (e' = 140)) → (Sense_Lights<0:3> ← 0);      Sense_lights off
SLN (:= (op = 760) ∧ (e'<9:14> = 14) ∧ (e'<15:17> ≠ 0)) → (      Sense_lights on
    Sense_Lights<e'<15:17>> ← 1);
SLT (:= (op = -760) ∧ (e'<9:14> = 14)) → (                       Sense_lights test
    Sense_Lights<e'<15:17>> → (IC ← IC + 1; Sense_Lights<e'<15:17>> ← 0));
ETM (:= (op = 760) ∧ (e' = 7)) → (Trap_Mode ← 1);                enter Trap_Mode
LTM (:= (op = -760) ∧ (e' = 7)) → (Trap_Mode ← 0);               leave Trap_Mode
EMTM (:= (op = -760) ∧ (e' = 16)) → (Multiple_Tag_Mode ← 1);     enter Multiple_Tag_Mode
LMTM (:= (op = 760) ∧ (e' = 16)) → (Multiple_Tag_Mode ← 0);      leave Multiple_Tag_Mode
)                                                                end Instruction_execution





Appendix  2 

IBM  7909  Data  Channel   ISP  Description  (a  Pio) 

Although the following description is of a Pio, signals generated in Ms and K are necessary. Appendices 1, 3, and 4 are also necessary for a complete description. The Ms attached to K controls the precise time at which information flows.
Pio State

CC<21:35>                       Command Counter: 15 bit command (or instruction) counter containing the location of the next command
AC<21:35>                       Address Counter: during vector data transfers AC contains the address of the next data word to transfer. During a transfer command AC is set to the address of the next command
AR<S,1:35>                      Assembly Register: a buffer for data flow between the data register and the device control registers
ARc[0:5]<0:5> := AR<S,1:35>     character array defined by AR; a character is normally selected by ARc[ASR]
CTC<0:5>                        Control Counter: a 6 bit register which can be loaded and stored by the ISP
WC<3:17>                        Word Counter: a counter controlling the number of words left to transfer during a command

Data transmission modes in Pio for the K-Pio dialogue
These control the flow direction and data types between K and Pio. Although not described as such, each indicator is mutually exclusive of the others.

SNI                             Sense Indicator: K is transmitting sense data to Pio.
WRI                             Write Indicator: K is receiving data from Pio.
RDI                             Read Indicator: K is transmitting data to Pio.

Wait                            bit denotes a halted condition in Pio; instructions are not executed
IL := 42₈                       Interrupt Location for Pio('A) to interrupt itself. Each of the 8 Pio's has special locations; two locations, IL and IL+1, are reserved
Interrupt_Request := ((CKC<1:6> ∧ CKCI<30:35>) ≠ 0)
                                signifies a request to interrupt Pio from K or within Pio
Pc_Trap_Request                 signifies a request to trap Pc from Pio
Interrupt_Mode                  bit to denote that an interrupt program is running in Pio
CKC<1:6>                        Check Conditions in K that cause an interrupt of the Pio
    CKC<1>/Input_Output_Check/I_O_Check
    CKC<2>/Sequence_Check
    CKC<3>/K_Unusual_End
    CKC<4:5>/Attention_Conditions<1:2>
    CKC<6>/K_Check
CKCI<30:35>                     a mask to inhibit Pio interrupts from CKC
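
The Interrupt_Request definition above is a masked test: a check condition interrupts the Pio only if the corresponding bit of the mask CKCI is also set. The Python sketch below illustrates that rule. It assumes both registers are held as 6-bit integers with CKC<1> and CKCI<30> in the leftmost position; the constant names follow the list above but are otherwise illustrative.

```python
# CKC<1:6>, leftmost bit first, in the order listed in the Pio state
I_O_CHECK = 0b100000
SEQUENCE_CHECK = 0b010000
K_UNUSUAL_END = 0b001000
ATTENTION_1 = 0b000100
ATTENTION_2 = 0b000010
K_CHECK = 0b000001


def interrupt_request(ckc, ckci):
    """True when any check condition in CKC is enabled by the mask CKCI."""
    return (ckc & ckci & 0o77) != 0


# A sequence check is pending, but the mask enables only I/O checks.
print(interrupt_request(SEQUENCE_CHECK, I_O_CHECK))
# The same condition with every mask bit enabled.
print(interrupt_request(SEQUENCE_CHECK, 0o77))
```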

The CKC indicators are described as follows:

Input_Output_Check

This condition occurs when the channel fails to obtain a storage reference cycle in time to satisfy demands of the attached IO device. The condition is also monitored in the Pc. I_O_Check is turned off when an LIP or LIPT command is executed or when the Pc executes an RSC or RIC instruction.

When an I_O_Check occurs, the adapter is disconnected and an interrupt occurs when the K_End signal is received from the adapter (K). The command counter contains the location plus one of the present command. The address counter contains the location plus one or two of the last word transmitted if the operation was a write or control, or the location plus one of the last word transmitted if the operation was a read or sense.

If an I_O_Check occurs while the channel is in interrupt mode, the I_O_Check is not recognized and is not saved.




Sequence_Check

A Sequence_Check indicates an invalid sequence of channel commands. If a Sequence_Check occurs during data transmission, the adapter is logically disconnected and the interrupt occurs when the K_End signal is received.

The following instructions cause a Sequence_Check and a channel interrupt. (The checks are not described in the ISP description.)

1. If a CTLW, CTLR, or SNS is followed by CTL, CTLW, WTR, TWT, or SNS.

2. If an SNS or CPYP is followed by any command other than a CPYP, CPYD, TCH, or TDC.

3. If a TCH or TDC following an SNS or CPYP transfers control to any command other than a CPYP, CPYD, TCH, or TDC.

4. If a CPYP or CPYD has not been properly preceded by a CTLW, CTLR, or SNS.

K_Unusual_End

This signal indicates an error condition recognized by K. It causes an immediate interrupt to Pio. The signal may be determined by sensing the K error indication.

Attention Conditions

This is a signal indicating a change in status of the attached input output device. For example, during disk operations, an attention signal is generated when an access mechanism has completed a seek operation. The particular access mechanism that generated this indication may be determined from sense data.

K_Check

Adapter check (K_Check) indicates an error and is recognized by the 7909, but does not necessarily indicate a K malfunction. The conditions which cause an adapter check are:

1. Circuit failure occurs in the ASR or CR.

2. The character rate of the attached IO device exceeds the capability of the channel.

3. The adapter (K) is not operational. This indication occurs if power is off on the adapter and an attempt is made to read, write, control, or sense.

Hardware Switches

These gates route information among the registers on a selected basis. They are not under control of the program and are not registers.

Storage Bus Switches<S,1:35>        These 36 switches (and/or gates) provide the data path to and from the 7606 Multiplexor for data or command entry into the Pio.
Channel Address Switches<21:35>     These 15 switches provide the Mp with address information. Address information is selected from the Address Counter or the Command Counter.
Character Switches<0:5>             These 6 bit switches enable the character to be read from or written into the Assembly Register.

Pio State (not in ISP)
Hardware registers not in ISP but used in the description and the Pio.

OP                  Operation Register. The register containing the operation part of the instruction. OP is made up from i<S,1:3,19>.
DR<S,1:35>          Data Register. A buffer for data flow between M and the AR.
CR                  Character Ring. A register to control the timing of transmission into AR.
ASR                 Assembly Ring. The counter to control the gates to/from AR from/to K. Data are sent to or received from the control, K, one 6-bit character at a time via the Character Switches under control of ASR.

Instruction Format

i<S,1:35>                   instruction; normally IBM calls these commands because a Pio executes them
f := i<18>                  indirect
op<0:4> := i<S,1:3,19>      operation code
y<0:14> := i<21:35>         address
c<0:14> := i<3:17>          count part
c'<0:2> := i<3:5>
m<0:5> := i<12:17>          mask




e<21:35> := (¬f → y; f → M[y]<21:35>);     2 level of indirect addressing
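
The effective-address rule above selects either the address field y of the command or, when the indirect bit f is set, the address field of the word that y points to. The Python sketch below illustrates that selection. It assumes 36-bit words numbered S = 0 through 35 from the left and models memory as a dictionary; both choices, and the helper names, are illustrative.

```python
def field(word, first, last):
    """Return bits <first:last> of a 36-bit word; bit 0 is S, bit 35 is low order."""
    width = last - first + 1
    return (word >> (35 - last)) & ((1 << width) - 1)


def effective_address(i, memory):
    """e := (not f -> y; f -> M[y]<21:35>), with f = i<18> and y = i<21:35>."""
    y = field(i, 21, 35)
    f = field(i, 18, 18)
    if f:
        return field(memory[y], 21, 35)
    return y


memory = {0o100: 0o200}
direct = 0o100                         # f = 0, y = 100 (octal)
indirect = (1 << (35 - 18)) | 0o100    # same y with the indirect bit set
print(oct(effective_address(direct, memory)))
print(oct(effective_address(indirect, memory)))
```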

Mp State

M[0:32768-1]<S,1:35>        Computer's primary memory

Instruction Interpretation Process

¬Interrupt_Request ∧ ¬Wait → (instruction ← M[CC];        fetch, no interrupt
    CC ← CC + 1; next
    Instruction_execution);                               execute, no interrupt
Interrupt_Request ∧ ¬Interrupt_mode → (                   interrupt process
    M[IL]<21:35> ← CC; M[IL]<3:17> ← AC;
    Interrupt_mode ← 1; next CC ← IL + 1);

Pio Interrupts and Pc Traps

The Pio is capable of having its stored program interrupted independently of other P's. This operation is separate and distinct from a data channel trap in which Pio interrupts the Pc. On recognition of an interrupt condition the Pio stores the contents of the command and address counters in a fixed memory location, IL, and then executes the command located in the next location.

If the 7909 channel is to be diverted from normal command execution sequence, the command in the fixed location must be one that will change the contents of the command counter (TCH, LIPT, or successful TDC or TCM). If this command is other than a successful transfer, the channel executes it and resumes operation at the location immediately following the location where the interrupt occurred. If the command at the fixed location is a WTR or TWT, the channel suspends operation as described in the channel command section, but the command counter contains the location plus one of the command responsible for the interrupt.

Interrupt conditions are stored in a six-position register in the data channel and may be examined with the TCM command. Any combination of interrupt conditions causes an interrupt; however, once interrupted the channel is placed in interrupt mode and further attempts to set the interrupt condition or to interrupt are inhibited. The channel remains in interrupt mode until an LIP or LIPT command is executed by the channel or an RIC instruction is executed by the CPU. If a channel is in interrupt mode and an RSC instruction is executed by the CPU before the channel executes a LIP or LIPT command, the interrupt condition register is reset but the channel remains in interrupt mode. An LIP or LIPT command or an RIC instruction is the only program means available to cause the channel to exit from interrupt mode and become receptive to further interrupt conditions.

Interrupts are also inhibited if a channel trap is in process on that channel. This inhibiting persists until either an RSC or STC instruction (depending on whether the channel was enabled) is executed by the Pc.
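
The interrupt sequence described above (save the command and address counters at the fixed location IL, enter interrupt mode, continue at the next location, and later leave with LIP) can be sketched in Python as follows. The value given to IL, the dictionary-based state, and the helper names are illustrative assumptions; words are taken to be 36 bits numbered S = 0 through 35 from the left.

```python
IL = 0o42  # illustrative interrupt location; the actual value depends on the channel


def field(word, first, last):
    """Return bits <first:last> of a 36-bit word."""
    width = last - first + 1
    return (word >> (35 - last)) & ((1 << width) - 1)


def set_field(word, first, last, value):
    """Return word with bits <first:last> replaced by value."""
    width = last - first + 1
    shift = 35 - last
    mask = ((1 << width) - 1) << shift
    return (word & ~mask) | ((value << shift) & mask)


def enter_interrupt(state, memory):
    """Store CC and AC at IL, set interrupt mode, and continue at IL + 1."""
    word = memory.get(IL, 0)
    word = set_field(word, 21, 35, state["CC"])
    word = set_field(word, 3, 17, state["AC"])
    memory[IL] = word
    state["interrupt_mode"] = True
    state["CC"] = IL + 1


def lip(state, memory):
    """LIP: restore CC from M[IL]<21:35>, clear CKC, and leave interrupt mode."""
    state["CC"] = field(memory[IL], 21, 35)
    state["CKC"] = 0
    state["interrupt_mode"] = False


state = {"CC": 0o1000, "AC": 0o2000, "CKC": 0o20, "interrupt_mode": False}
memory = {}
enter_interrupt(state, memory)
lip(state, memory)
print(oct(state["CC"]), state["interrupt_mode"])
```

The sketch leaves out the inhibiting of further interrupts while interrupt mode is set; a fuller model would test that flag before calling `enter_interrupt`.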

This command, when decoded by a channel not prepared to read or write, causes a sequence check and, thus, a channel interrupt. If the channel is prepared to read or write, this command causes c words to be transmitted between the channel and M, starting with M[e]. Data transmission continues until c is reduced to zero or a K_End signal is received by the channel. In either case, the channel read or write indicator is reset. If, while a CPYD is being executed, a K_End signal is received before the count is reduced to zero, the channel read or write indicator is reset, and the channel obtains a new command from the next sequential location.

If the next command is other than a copy, the channel executes that command. If the next command is a copy, the channel interrupts on a program sequence check. The last word transmitted to storage under CPYD control remains in the assembly register if a K_End signal is received before the word count reaches zero.

If the count for the CPYD goes to zero before the K_End signal is received, the channel initiates a disconnect but does not get the next sequential command until a K_End or K_Unusual_End signal is obtained. In general, when operating under CPYD control, the channel does not obtain the next sequential command until either a K_End or a K_Unusual_End signal causes an interrupt.

Instruction Set and Instruction Execution Process
The following control commands transmit instructions (orders) or operation information to K. Information is sent to K from M[e], starting with the high order 6 bit character, and continues until a K_End is received by Pio from K. If more than one control word is required, the next words come from M[e+1, e+2, ...].

For CTL, CTLR, and CTLW instructions, the control words are first transmitted. Next the Read or Write indicator is set in Pio.
Instruction_execution := (

CTL  (:= op = 01000) → (AC ← e;                           control
    Move_word_from_M; ASR ← 0; next
    Move_control_char_to_K);
CTLR (:= op = 01001) → (AC ← e;                           control and read
    Move_word_from_M; ASR ← 0; next
    Move_control_char_to_K; RDI ← 1);
CTLW (:= op = 01010) → (AC ← e; next                      control and write
    Move_word_from_M; ASR ← 0; next
    Move_control_char_to_K; WRI ← 1);
CPYD (:= op = 100$1) → (AC ← e;                           copy and disconnect
    Copy_data_block;
    RDI□SNI□WRI ← 0; next
    K_end_wait);
CPYP (:= op = 100$0) → (AC ← e;                           copy and proceed
    Copy_data_block);
SNS  (:= op = 01011) → (SNI ← 1);                         sense
        The data in K's sense indicators are sent via the K_Data register through AR and DR to M.
        Execution of this command must be followed by a copy command.
SMS  (:= (op = 11100) ∧ (c' = 0)) → CKCI ← e<29:35>;      set mode and select
LCC  (:= op = 11011) → (AC ← e; next                      load control counter
    CTC ← AC<30:35>);
TDC  (:= op = 11010) → (AC ← e; next                      transfer and decrement counter
    (CTC = 0) → ;
    (CTC ≠ 0) → (CTC ← CTC - 1; CC ← AC));
ICC  (:= op = 111$1) → (                                  insert control counter
    (0 < c' < 7) → ARc[c'] ← CTC;
    (c' = 0) → ARc[5] ← CKCI;
    (c' = 7) → ;);
TCM  (:= op = 101$1) → (                                  transfer on conditions met
    ((c' = 0) ∧ ¬i<11> ∧ (m = CKC)) → (CC ← e);
    ((c' = 0) ∧ i<11> ∧ ((m ∧ CKC) = m)) → (CC ← e);
    ((0 < c' < 7) ∧ ¬i<11> ∧ (m = ARc[c'])) → (CC ← e);
    ((0 < c' < 7) ∧ i<11> ∧ ((m ∧ ARc[c']) = m)) → (CC ← e);
    ((c' = 7) ∧ (m = 0)) → (CC ← e));
TCH  (:= op = 001$0) → (CC ← e);                          transfer in channel
LAR  (:= op = 01100) → (AC ← e; next AR ← M[AC]);         load assembly register
SAR  (:= op = 01101) → (AC ← e; next M[AC] ← AR);         store assembly register
XMT  (:= op = 00011) → (AC ← e; WC ← c; next              an instruction to move c words in M[CC:(CC + c)] to M[e:(e + c)]
    M_block_move);

XMT is actually a vector move within Mp.

XMT  (:= op = 000$1) → ((c ≠ 0) → (
    M[e:(e + c - 1)] ← M[CC:(CC + c - 1)];                vector move
    WC ← 0; AC ← AC + c;                                  fix end conditions
    CC ← CC + c));
WTR  (:= op = 000$0) → (AC ← e; Wait ← 1);                wait and transfer
TWT  (:= op = 01110) → (AC ← e; Wait ← 1;                 trap and wait
    Pc_Trap_Request ← 1);
LIP  (:= op = 11001) → (                                  leave interrupt program
    CC ← M[IL]<21:35>;
    CKC ← 0; Interrupt_Mode ← 0);
LIPT (:= op = 001$1) → (                                  leave interrupt program and transfer
    CC ← e; CKC ← 0;
    Interrupt_Mode ← 0)
)                                                         end Instruction_execution
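
The LCC and TDC commands in the listing above form the channel's loop primitive: LCC loads the 6-bit control counter CTC, and TDC transfers while CTC is nonzero, decrementing it each time. The Python sketch below illustrates that behaviour. The dictionary-based state, the command locations, and the `run_loop` driver are illustrative assumptions.

```python
def lcc(state, e):
    """LCC: AC <- e; next CTC <- AC<30:35> (the low-order six bits)."""
    state["AC"] = e
    state["CTC"] = e & 0o77


def tdc(state, e):
    """TDC: AC <- e; if CTC is nonzero, decrement it and transfer to e."""
    state["AC"] = e
    if state["CTC"] != 0:
        state["CTC"] -= 1
        state["CC"] = state["AC"]


def run_loop(count, loop_top=5, tdc_location=10):
    """Count passes through a body that ends with TDC back to loop_top."""
    state = {"CC": 0, "AC": 0, "CTC": 0}
    lcc(state, count)
    passes = 0
    while True:
        passes += 1                        # loop body
        state["CC"] = tdc_location + 1     # CC already advanced past the TDC
        tdc(state, loop_top)
        if state["CC"] != loop_top:        # counter exhausted: fall through
            return passes


print(run_loop(3))
```

Because TDC tests the counter before decrementing, a body closed by TDC runs one more time than the value loaded by LCC.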




K, Pio, and M Data Movement Processes
The following processes define the movement of characters and words among the registers and Memory. The principal activity is Copy_data_block. On writing, a word is taken from M and placed in Pio, then transferred character by character to K. On reading, a character is taken from K and assembled in Pio, then transferred as a word to M. The following processes move either characters or words in a direction relative to Pio.

Move_char_to_K                  writing into K
Move_control_char_to_K          setting up instruction in K
Move_char_from_K                reading from K
Move_word_to_M                  writing into M
Move_word_from_M                reading from M
M_block_move                    read M, write M on a word by word basis
K_end_wait                      process to wait for K end signals

Copy_data_block := (
    RDI → (Move_char_from_K; ASR ← 0);
    SNI → (Move_char_from_K; ASR ← 0);
    WRI → (Move_word_from_M; ASR ← 0; WC ← WC - 1; next
        Move_char_to_K))

Move_char_to_K := (                                       K ← Pio ← M data movement
    K_End ∨ (WC = 0) → ;                                  stop at end
    ¬K_End ∧ (WC ≠ 0) ∧ K_Data_Rq → (                     transmit a char
        (ASR = 0) → Move_word_from_M; WC ← WC - 1; next
        K_Data ← ARc[ASR]; ASR ← ASR + 1; next
        Move_char_to_K);
    ¬K_End ∧ (WC ≠ 0) ∧ ¬K_Data_Rq → (                    idle till char arrives
        Move_char_to_K))

Move_control_char_to_K := (                               K ← Pio ← M
    K_End → ;                                             stop at end
    ¬K_End ∧ K_Data_Rq → (                                transmit a char
        (ASR = 0) → Move_word_from_M; next
        K_Data ← ARc[ASR]; ASR ← ASR + 1; next
        Move_control_char_to_K);
    ¬K_End ∧ ¬K_Data_Rq → Move_control_char_to_K)         idle, till char arrives

Move_char_from_K := (                                     M ← Pio ← K data movement
    K_End ∨ (WC = 0) → ;                                  stop at end
    ¬K_End ∧ (WC ≠ 0) ∧ K_Data_Rq → (                     receive a char
        ARc[ASR] ← K_Data; ASR ← ASR + 1; next
        (ASR = 0) → (Move_word_to_M; WC ← WC - 1); next
        Move_char_from_K);
    ¬K_End ∧ (WC ≠ 0) ∧ ¬K_Data_Rq → (                    idle till char arrives
        Move_char_from_K))

Move_word_to_M := (DR ← AR; next                          M ← Pio data movement
    M[AC] ← DR; AC ← AC + 1)

Move_word_from_M := (AR ← DR; next                        Pio ← M data movement
    DR ← M[AC]; AC ← AC + 1)

M_block_move := (                                         M ← M block move process for moving WC words within M, i.e., M[CC:(CC + WC)] → M[AC:(AC + WC)]
    (WC = 0) → ;
    (WC ≠ 0) → (DR ← M[CC]; CC ← CC + 1; next
        M[AC] ← DR; AC ← AC + 1; WC ← WC - 1; next
        M_block_move))

K_end_wait := (                                           process to idle until K transmits an end signal
    ¬(K_End ∨ K_Unusual_End) → K_end_wait;
    (K_End ∨ K_Unusual_End) → )
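
The write path described above takes one 36-bit word at a time from M and hands it to K as six 6-bit characters; the read path does the reverse. The Python sketch below shows only that packing and unpacking. It assumes ARc[0] is the high-order character, consistent with the statement that transmission starts with the high order 6 bit character, and it omits the K_End and K_Data_Rq handshakes; all function names are illustrative.

```python
def word_to_chars(word):
    """Split a 36-bit word into six 6-bit characters, high-order character first."""
    return [(word >> (30 - 6 * k)) & 0o77 for k in range(6)]


def chars_to_word(chars):
    """Assemble six 6-bit characters (high-order first) into a 36-bit word."""
    word = 0
    for ch in chars:
        word = (word << 6) | (ch & 0o77)
    return word


def write_block(memory, ac, wc):
    """Stream wc words starting at M[ac] to K as a flat list of characters."""
    sent = []
    for offset in range(wc):
        sent.extend(word_to_chars(memory[ac + offset]))
    return sent


def read_block(chars, memory, ac):
    """Assemble characters from K into words and store them starting at M[ac]."""
    for k in range(0, len(chars) - len(chars) % 6, 6):
        memory[ac + k // 6] = chars_to_word(chars[k:k + 6])
    return memory


memory = {0o100: 0o010203040506, 0o101: 0o777700000001}
stream = write_block(memory, 0o100, 2)
print(stream[:6])
```

A character count that is not a multiple of six leaves a partial word in the assembly register in the real channel; this sketch simply drops the incomplete tail.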





Appendix 3

K('Hypertape) and K('disk) ISP Descriptions

These K depend on control and state definitions from Pio of Appendix 2.

K State

K_op<0:1>           the operation or instruction register in K
K_Data<0:5>         data buffer in K; used for transmitting and receiving characters
K_Data_Rq           used to control data flow between ARc[ASR] and K_Data: signal in K denoting K_Data requires new data if writing, or has a full data buffer if reading
K_End               set by K at the completion of reading or writing a block of data
K_Unusual_End       set by K when an error is detected during writing or reading and data flow must be terminated


The following sense data bits for tape originate in Ms and K. These registers can be read by Pio using the Pio SNS instructions. Some of the bits are set using the CTL, CTLR, or CTLW instructions from Pio as control words.

SDT[0:1]<S,1:35>                        sense data for K('Hypertape)

SDT[0]<1>/Operator Required := (
    SDT[0]<13>/Selected Drive Not Ready ∨
    SDT[0]<15>/Selected Drive Not Loaded ∨
    SDT[0]<16>/Selected Drive File Protected ∨
    SDT[0]<17>/Operation Not Started)
SDT[0]<3>/Program Check := (
    SDT[0]<19>/Invalid Order Code ∨
    SDT[0]<21>/Selected Drive Busy ∨
    SDT[0]<22>/Selected Drive at Beginning of Tape ∨
    SDT[0]<23>/Selected Drive at End of Tape)
SDT[0]<4>/Data Check := (
    SDT[0]<25>/Correction Occurred ∨
    SDT[0]<27>/Channel Parity Check ∨
    SDT[0]<28>/Code Check ∨
    SDT[0]<29>/Envelope Check ∨
    SDT[0]<31>/Overrun or Character Lost Check ∨
    SDT[0]<33>/Excessive Skew Check ∨
    SDT[0]<34>/Track Start Check or Clock Lost Check)
SDT[0]<5>/Exception Conditions := (
    SDT[1]<1>/Selected Drive Read a Tape Mark ∨
    SDT[1]<3>/Selected Drive in End of Tape Warning Area)
SDT[0]<7,9:11>/Selected Tape Unit Address<0:3>
SDT[1]<7>/Read Section Busy
SDT[1]<9>/Write Section Busy
SDT[1]<11>/Backward Mode
SDT[1]<13,15:17,19,21:23,25,27>/Drive Attention[0:9]

SDF[0:1]<S,1:35>                        sense data for the K('Disk)

SDF[0]<3>/Program Check := (
    SDF[0]<7>/Invalid Sequence ∨
    SDF[0]<9>/Invalid Code ∨
    SDF[0]<10>/Format Check ∨
    SDF[0]<11>/No Record Found ∨
    SDF[0]<13>/Invalid Address)
SDF[0]<4>/Data Check := (
    SDF[0]<15>/Response Check ∨
    SDF[0]<16>/Data Compare Check ∨
    SDF[0]<17>/Parity or Cyclic Code)
SDF[0]<5>/Exception Condition := (
    SDF[0]<19>/Access Inoperative ∨
    SDF[0]<21>/Access Not Ready ∨
    SDF[0]<22>/Disk Circuit Check ∨
    SDF[0]<23>/File Circuit Check)
SDF[0]<7>/Six Bit Mode/Status Bit
SDF[0]<31,33:35>□SDF[1]<1,3:5,7,9>/Access 0, Module[0:9]
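
Each summary bit in the sense data above (Program Check, Data Check, Exception Condition) is defined as the OR of a group of detail bits. The Python sketch below illustrates that derivation for the disk sense word SDF[0]. The bit numbers are those read from the list above, the word is assumed to be 36 bits numbered S = 0 through 35 from the left, and the table and function names are illustrative.

```python
def bit(word, i):
    """Return bit i of a 36-bit word; bit 0 is S, bit 35 is low order."""
    return (word >> (35 - i)) & 1


def with_bit(word, i):
    """Return word with bit i set."""
    return word | (1 << (35 - i))


# summary bit -> detail bits of SDF[0], as read from the sense data list
DISK_SUMMARY = {
    3: (7, 9, 10, 11, 13),   # Program Check
    4: (15, 16, 17),         # Data Check
    5: (19, 21, 22, 23),     # Exception Condition
}


def summarize(sdf0):
    """Set each summary bit whose detail group has at least one bit on."""
    result = sdf0
    for summary, details in DISK_SUMMARY.items():
        if any(bit(sdf0, d) for d in details):
            result = with_bit(result, summary)
    return result


word = summarize(with_bit(0, 16))      # Data Compare Check only
print(bit(word, 3), bit(word, 4), bit(word, 5))
```

A channel program would normally test only the summary bits first and look at the detail bits when one of them is on.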




Control Orders, i.e., Instruction Names and Numbers for K(disk)

These instructions are set in the K op register by the CTL instructions from Pio. The instructions are then executed by the K's. They will only be given as names, mnemonics, and operation codes.

DNOP (:= K_op = AA)         no operation
DREL (:= K_op = A4)         release
DEBM (:= K_op = A8)         eight bit mode
DSBM (:= K_op = A9)         six bit mode
DSEK (:= K_op = 8A)         seek
DVSR (:= K_op = 82)         prepare to verify (single record)
DWRF (:= K_op = 83)         prepare to write format
DVTN (:= K_op = 84)         prepare to verify (track with no addresses)
DVCY (:= K_op = 85)         prepare to verify (cylinder operation)
DWRC (:= K_op = 86)         prepare to write check
DSAI (:= K_op = 87)         set access inoperative
DVTA (:= K_op = 88)         prepare to verify (track with addresses)
DVHA (:= K_op = 89)         prepare to verify (home address)

Control Orders, i.e., Instruction Names and Numbers for K('Hypertape)

HNOP (:= K_op = AA)         no operation
HEOS (:= K_op = A1)         end of sequence
HRLF (:= K_op = A2)         reserved light off
HRLN (:= K_op = A3)         reserved light on
HCLN (:= K_op = A5)         check light on
HSEL (:= K_op = A6)         select
HSBR (:= K_op = A7)         select for backward reading
HCCR (:= K_op = 28)         change cartridge and rewind
HRWD (:= K_op = 3A)         rewind
HRUN (:= K_op = 31)         rewind and unload cartridge
HERG (:= K_op = 32)         erase long gap
HWTM (:= K_op = 33)         write tape mark
HBSR (:= K_op = 34)         backspace
HBSF (:= K_op = 35)         backspace file
HSKR (:= K_op = 36)         space
HSKF (:= K_op = 37)         space file
HCHC (:= K_op = 38)         change cartridge
HUNL (:= K_op = 39)         unload cartridge
HFPN (:= K_op = 42)         file protect on



APPENDIX 4  IBM 7094 Pc INSTRUCTIONS TO Pio('7909)

Pc State

Pc trap enable<A,B,C,D,E,F,G,H>
An 8 bit register in Pc which is used to mask or allow trap
requests from Pio('A, 'B, ... 'H)

Instruction Set
The following instructions in Pc are used to operate on each Pio state; thus, each instruction is actually 8 instructions.

RSC → (Wait → (CC ← e; Wait ← 0);            reset and start channel;
       ¬Wait → RSC);                         initializes a Pio

STC → (Wait → (CC ← AC; Wait ← 0);           start the Pio program
       ¬Wait → STC);

SCH → (M[e]<21:35> ← CC; M[e]<3:17> ← AC);   store channel. Checks status of a Pio.

ENB → (Pc trap enable ← M[e]<28:35>);        enable from effective address

RIC → (CTC□AC□AR□CC□WC□Wait ← 0);            reset channel

TCO → (¬Wait → IC ← e);                      transfer on channel in operation

TCN → (Wait → IC ← e);                       transfer on channel not in operation

Section  2 


The  SDS  910-9300  series, 
a  planned  family 

The Scientific Data Systems 900-9000 series consists of the SDS
910, 920, 925, 930, 940, 945, and 9300 computers. The series
includes capabilities and features found in most 24-bit machines.
The design implementation is among the best for 24-bit
machines, as measured by equipment utilization, the processor
state, implementation technology, and ease of use.

The first delivery dates for the members of the series are 910
(August, 1962), 920 (September, 1962), 925 (February, 1965),
930 (June, 1964), 940 (April, 1966), 945 (~1968), and 9300
(December, 1964).

The 910 and 920 were designed at the same time as a
planned series of compatible computers which spanned a range
of performance. The 910 has instructions which facilitate defining
920 instructions by software. For example, these include
the multiply and divide step¹ (see page 544) instructions in
the 910 for programming the multiply and divide instructions
in the 920.

The I/O facility evolved to a clean structure, with the potential
for having a high degree of T and Ms data-transfer concurrency
at a comparatively low cost. The IBM 7094 should be
studied for a contrasting (more expensive) approach.

The instructions which help manipulate floating-point data
are interesting and useful. The machine's ability to execute
closed floating-point arithmetic subroutines is fairly good,
considering that the instructions are not hardwired.

The  Programmed  Operator  (POP)  instructions  provide  the 
ability  to  define  an  instruction  set  for  efficient  encoding.  The 
idea  appeared  earlier  in  Atlas.  However,  the  POP  instruction 
calls  subprograms  in  primary  memory,  instead  of  in  fixed 
memory  like  Atlas. 

A nice scheme¹ is described for increasing the memory
address space from 16,384 to 32,768 words. Other schemes
which switch memory banks, like those in the PDP-8 (Chap. 5)
and in the 65,384-word 7094 II (Chap. 41), tend to be less
desirable and flexible.

¹We believe this appeared originally in the DEC PDP-1, introduced in November,
1960.

The SDS 930 was used at the University of California (Berkeley)
as the base machine for the design of the Berkeley Time
Sharing System (Chap. 24). SDS later marketed the system as
the SDS 940.

The 9300 was not a member of the original 910-930 series.
There is almost symbolic language program compatibility. Several
registers and extra memory transfer paths were added to
form the 9300 from the 930. The power of the 9300 is only
a factor of 2 times the 930 for simple instructions. However,
the hardwired floating-point instructions in the 9300 increase
the power over the 930 by a factor of almost 10 for arithmetic
problems. It is hard to believe that the incompatible 9300 was
a wise choice. (We suggest a more reasonable alternative could
have been a two-processor 930'. The 930' processor would be
a 930 but with hardwired floating-point arithmetic instructions.)
The 9300 has interesting twin-mode instructions for simultaneously
operating on 12-bit data pairs. The 24-bit fixed-point
word is sufficient for the real-time applications for which the
computer was designed.

A flaw in the series is the sharing of K's among peripheral
T's and Ms's. This problem can be seen by looking at the PMS
structure (Chap. 42, Fig. 2, page 546). The connection to the
peripheral K from K('Channel) requires a continuous connection
during the data-transfer dialogue to Mp. This structure is especially
bad in the case of a slow T, for example, a typewriter.
A single character transmission requires that K('W, 'Y) be
assigned to the typewriter during the complete message transmission
(at a connected time of 100 milliseconds/character).
The problem can be avoided by placing a character memory
in each slow KT. Multiple devices could then run concurrently
without requiring the elaborate K('W, 'Y) to be attached to them.
The structure does not preclude such an improvement.

A  complete  description  of  the  input/output  and  interrupt 
system  is  given  and  should  be  read  carefully. 




Chapter  42 


The SDS¹ 910-9300 series

Introduction 

The SDS 910, 920, 925, and 930 form a compatible series of
computers. The 9300, though not compatible with the series, was
an outgrowth of it. The 9300 uses the Ms and T devices of the
930. The 940 was designed initially at the University of California,
Berkeley (see Chap. 24) for time sharing, and the 945 is a successor
to the 940. The word length is 24 bits, and one single-address
instruction is encoded per word. The state of the machine consists
of Mp(2048 ~ 32768 w) and Mps('P/Program Counter, 'A/Accumulator,
'B/Extended Accumulator, 'X/Index register).

These computers have been designed to process data originating
from physical processes in real time. This design goal leads
to a priority interrupt system with many (1,024) levels. The multiple
interrupts facilitate programming and decrease the interrupt
response time. A 24-bit word or two 12-bit words are a reasonable
size for the problem types encountered. A multiple of 6 bits was
chosen because of the (then) standard 6-bit magnetic-tape character.
The relatively efficient storage representation and processing
of floating-point data allow these computers to be used for
general-purpose computation. However, only the 9300 has built-in
floating-point operations. The 9300 has extensive capability for
more general-purpose use. It is also used for operations on
half-length data.

The data types processed by the 910-930 include words, integers,
addresses, and boolean vectors. Several special instructions
aid processing of floating-point and double-length integer types.
The 9300 processes the additional data types single- and double-length
floating point. The 9300 has twin-mode instructions which
operate on two half-length data (12 b) simultaneously. The two's
complement representation is used for negative numbers.

The multiply, divide, and several other instructions are not
wired into the 910, and compatibility between the 910 and 920-930
cannot be completely obtained by programming, although the 910
is a subset of the 920-930. Likewise, a smaller minimum Mp is
available on the 910 (2,048 words versus 4,096 words). The 920 and
930 have identical instruction sets and differ in memory and logic
performance. The 930 has a t.cycle: 1.75 μs, and the 910-920 has
t.cycle: 8 μs. The more elaborate PMS structure of the 930 allows
for greater growth (e.g., by having more access ports to Mp).

¹Scientific Data Systems merged with Xerox Corporation in 1969. The
divisional name became Xerox Data Systems (XDS).


The 9300's instruction set is different from the 930's. There are
three index registers. The PMS structure is similar to (and nearly
compatible with) the 930. There are more (and better) working
registers in the 9300 Pc to increase performance. The 9300 has
two memory-access links, and the Pc can fetch instructions and
data simultaneously. The instructions in the various C's appear
in Table 1 for comparison purposes.

The SDS 925, a 1.75-μs version of the SDS 910, was available
only for a brief time and will not be discussed further.

The machines process instructions (operations to the accumulator)
in the following times (microseconds):

Instruction                910     920     930     9300

Fixed-Point Add             16      16     3.5     1.75
Fixed-Point Multiply       248      32     7.0     7.0
Floating-Point Add         896     384      92    14.0
Floating-Point Multiply   1696     656     147    12.25
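
The ratios between family members follow directly from these times. The short Python sketch below only restates the table and divides entries; the names `TIMES_US` and `speedup` are illustrative and do not come from the SDS documentation.

```python
# Instruction times in microseconds, copied from the table above.
TIMES_US = {
    "fixed add":      {"910": 16.0,   "920": 16.0,  "930": 3.5,   "9300": 1.75},
    "fixed multiply": {"910": 248.0,  "920": 32.0,  "930": 7.0,   "9300": 7.0},
    "float add":      {"910": 896.0,  "920": 384.0, "930": 92.0,  "9300": 14.0},
    "float multiply": {"910": 1696.0, "920": 656.0, "930": 147.0, "9300": 12.25},
}


def speedup(op, slow, fast):
    """Return how many times faster machine `fast` is than `slow` for `op`."""
    return TIMES_US[op][slow] / TIMES_US[op][fast]


if __name__ == "__main__":
    for op in TIMES_US:
        print(f"{op:15s} 9300 vs 930: {speedup(op, '930', '9300'):.1f}x")
```

For fixed-point add the 9300 is twice as fast as the 930, while for floating-point multiply the ratio is 12, which is the effect of the hardwired floating-point instructions.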

Structure 

The structure of these computers is given with PMS and conventional
diagrams in Figs. 1 to 4.

The SDS channel is a Kio('Channel) and not a Pio, since it has
no program counter and uses Pc. However, it can be as effective
as a Pio. Of course, the cost is lower since Pc is shared. If K('W,
'Y) requires memory accesses, they must wait until suitable times
in the Pc instruction-interpretation process to communicate with
memory (Fig. 1).

The PMS structural detail (Fig. 2) does not show the algorithm
by which simultaneous Kio('W, 'Y, 'C, 'D) and Pc requests for Mp
are resolved. K has the highest priority, and further resolution
among K's is determined by the K with the fullest buffer memory.
Thus the priority is variable.
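
The resolution rule just described can be sketched in a few lines of Python. This is only an illustration of the stated policy (any K outranks Pc; the fullest buffer wins among K's). The function name, the dictionary interface, and the tie-breaking behavior are assumptions, since the text does not say how equal buffer levels are resolved.

```python
def grant_memory_cycle(buffer_fill, pc_requesting):
    """Pick who gets the next Mp cycle.

    buffer_fill maps a channel name ('W', 'Y', 'C', 'D') to the number of
    characters waiting in its buffer, for channels requesting memory now.
    Returns a channel name, "Pc", or None if nobody is requesting.
    """
    if buffer_fill:
        # Any requesting K outranks Pc; among K's the fullest buffer wins.
        # ASSUMPTION: ties go to the first channel listed.
        return max(buffer_fill, key=buffer_fill.get)
    return "Pc" if pc_requesting else None
```

Because the winner depends on buffer contents at the moment of the request, the same two channels can be ordered differently from one memory cycle to the next.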

There are three basic K types, or channels (Fig. 2), in the 930
and 9300:

1 K('Time Multiplexed Communications Channel, TMCC)

2 K('Direct Access Communications Channel, DACC)

3 K('Data Subchannel, DSC)




Table 1  SDS 910, 920, 930 and 9300 instruction sets¹

Mark  Mnemonic  Operands  Name

LOAD/STORE
+     LDA       M, T      Load A
+     STA       M, T      Store A
+     LDB       M, T      Load B
+     STB       M, T      Store B
      LDP       M, T      Load Double Precision
      STD       M, T      Store Double Precision
      LDS       M, T      Load Selective (Masked)
      STS       M, T      Store Selective (Masked)
+     LDX       M, T      Load Index X
+     STX       M, T      Store Index X
+     EAX       M, T      Copy Effective Address into Index Register 1
      STZ       M         Store Zero
+     XMA       M, T      Exchange M and A
      XMB       M, T      Exchange M and B
      XMX       M, T      Exchange M and Index Register

ARITHMETIC
+     ADD       M, T      Add M to A
      DPA       M, T      Double Precision Add
+     SUB       M, T      Subtract M from A
      DPS       M, T      Double Precision Subtract
      MPO       M, T      M Plus One
°     MIN       M, T      M Increment (M + 1)
      MPT       M, T      M Plus Two
†+    ADM       M, T      Add to Memory
†+    MUL       M, T      Multiply
†+    DIV       M, T      Divide
      TMU       M, T      Twin Multiply
      DPN       M, T      Double Precision Negate
x     MUS       M, T      Multiply Step
x     DIS       M, T      Divide Step
†°    SUC       M, T      Subtract with Carry
†°    ADC       M, T      Add with Carry*
x°    MDE       M, T      M Decrement

ARITHMETIC, FLOATING-POINT (OPTIONAL)
      FLA       M, T      Floating Add
      FLS       M, T      Floating Subtract
      FLM       M, T      Floating Multiply
      FLD       M, T      Floating Divide

LOGICAL
+     ETR       M, T      Extract
+     MRG       M, T      Merge

REGISTER CHANGE
      RCH       M, T      Register Change
      AXB       M, T      Address to Index Base
†°    CLA                 Clear A*
†°    CLB                 Clear B*
°     CLR                 Clear AB
†°    CAB                 Copy A into B*
°     ABC                 Copy A into B, Clear A
†°    CBA                 Copy B into A*
°     BAC                 Copy B into A, Clear B
°     XAB                 Exchange A and B
†°    CBX                 Copy B into Index*
†°    CXB                 Copy Index into B*
†°    XXB                 Exchange Index and B*
†°    STE                 Store Exponent*
†°    LDE                 Load Exponent*
†°    XEE                 Exchange Exponents*
†°    CXA                 Copy Index into A*
†°    CAX                 Copy A into Index*
°     XXA                 Exchange Index and A*
                          Copy Negative into A*
°     CLX                 Clear X
      COPY                Copy

BRANCH
+     BRU       M, T      Branch Unconditionally
+     BRX       M, T      Increase Index and Branch
+     BRM       M, T      Mark Place and Branch
      BRC       M, T      Branch and Clear Interrupt
      BMA       M, T      Branch and Mark Place or Argument Address
+     BRR       M, T      Return Branch

TEST/SKIP
†+    SKE       M, T      Skip if A Equals M
+     SKG       M, T      Skip if A Greater than M
      SKL       M, T      Skip if A Less than M
+     SKM       M, T      Skip if A equals M on B Mask
      SKU       M, T      Skip if A Unequal M
      SKQ       M, T      Skip if Masked Quantity in A Greater than M
      SKF       M, T      Skip if Floating Exponent in B is Greater than or Equal
+     SKA       M, T      Skip if A and M do not Compare Ones Anywhere
†+    SKB       M, T      Skip if B and M do Compare Ones Anywhere
+     SKN       M, T      Skip if M is Negative
†+    SKR       M, T      Reduce M, Skip if Negative



Table 1  SDS 910, 920, 930 and 9300 instruction sets (Continued)

Mark  Mnemonic  Operands  Name

LOGICAL (Continued)
+     EOR       M, T      Exclusive OR

REGISTER SHIFT
      SHIFT     M, T      Shift
      ARSA      N, T      Arithmetic Right Shift A
      ARSB      N, T      Arithmetic Right Shift B
°     RSH       N, T      Arithmetic Right Shift AB
      ARSD      N, T      Arithmetic Right Shift Double
      ARST      N, T      Arithmetic Right Shift Twin
      LRSA      N, T      Logical Right Shift A
      LRSB      N, T      Logical Right Shift B
°     LRSH      N, T      Logical Right Shift AB (930 only)
      LRSD      N, T      Logical Right Shift Double
      LRST      N, T      Logical Right Shift Twin
      CRSA      N, T      Circular Right Shift A
      CRSB      N, T      Circular Right Shift B
°     RCY       N, T      Circular Right Shift AB
      CRSD      N, T      Circular Right Shift Double
      CRST      N, T      Circular Right Shift Twin
°     LSH       N, T      Arithmetic Left Shift AB
      ALSA      N, T      Arithmetic Left Shift A
      ALSB      N, T      Arithmetic Left Shift B
      ALSD      N, T      Arithmetic Left Shift Double
      ALST      N, T      Arithmetic Left Shift Twin
      LLSA      N, T      Logical Left Shift A
      LLSB      N, T      Logical Left Shift B
      LLSD      N, T      Logical Left Shift Double
      LLST      N, T      Logical Left Shift Twin
      CLSA      N, T      Circular Left Shift A
      CLSB      N, T      Circular Left Shift B
°     LCY       N, T      Circular Left Shift AB
      CLSD      N, T      Circular Left Shift Double
      CLST      N, T      Circular Left Shift Twin
      NORA      N, T      Normalize A
°     NOD       N, T      Normalize; Decrement X
      NORD      N, T      Normalize Double

CONTROL
+     HLT                 Halt
+     NOP       M, T      No Operation
+     EXU       M, T      Execute
      INR       M, T      Interpret
      REP       M, T      Repeat

TEST/SKIP (Continued)
      SKP       M, T      Skip if Bit Sum Even
+     SKS       M, T      Skip if Signal Not Set
°     SKD       M, T      Difference Exponents: Skip*

FLAG REGISTER
      FRTS      M         Flag Indicator Reset Test Set
      FLAG      M         Flag
      FIRS      M         Flag Indicator Reset Set
      FSTR      M         Flag Indicator Set Test Reset
      FRST      M         Flag Indicator Reset Set Test
      SWT       M         SENSE Switch Test

INTERRUPTS
+     EIR                 Enable Interrupts
+     DIR                 Disable Interrupts
+     EIT                 Interrupt Enabled Test
+     IDT                 Interrupt Disabled Test
+     AIR                 Arm Interrupts

MEMORY EXTENSION (930 ONLY)
                          Set Extension Register
                          Extension Register Test

BREAKPOINT TESTS (SENSE SWITCHES IN 9300)
°     BPT                 Breakpoint No. 4 Test
°     BPT                 Breakpoint No. 3 Test
°     BPT                 Breakpoint No. 2 Test
°     BPT                 Breakpoint No. 1 Test

OVERFLOW (FLAG IN 9300)
°     ROV                 Reset Overflow
°     REO                 Record Exponent Overflow
°     OVT                 Overflow Test; Reset

PROGRAMMED OPERATORS
°     POP       M, T      Programmed Operator (64 instructions)

¹M Memory or Memory Address; N number of shifts; T tag field; + also in the 910, 920 and 930; x 910 only; ° not in the 9300; † not in the 910.




[Figure: PMS diagram. Mp and Pc connect through S('In-Out Bus) to K('W) and K('Y), each of which switches to peripheral K's serving T and Ms; the bus is under programmed control by the EOM, SKS, PIN, and POT commands.]

Pc(1 address/instruction; 1 instruction/w; 24 b/w; technology: transistor; 1962 ~ 1968)
Mp(core; 8 μs/w; 2048 ~ 16384 w; 24 b/w)

Fig. 1. SDS 910 and 920 PMS diagram.


The links between KT or KMs and any one of K('TMCC),
K('DACC), and K('DSC) are identical. The KT or KMs assembles/
disassembles characters into/from words and transmits/receives
them to/from the Kio('Channel). The channel communicates with
Mp or Pc for data transmission and finally communicates with Pc
at task completion (the block of data transferred). Task alarms may
cause Kio to interrupt Pc. Each Kio('Channel) can assemble data
on a 6-, 12-, or 24-bit basis for Mp accesses. A K('Channel) recognizes
two types of information: data being transmitted between
Mp and the peripheral K, and initialization or controlling information
from Pc.
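
Assembly on a 6-, 12-, or 24-bit basis amounts to packing fixed-width characters into 24-bit words. The Python sketch below illustrates the idea; the function names, the choice to place the first character in the most significant position, and the zero fill of a short final word are all assumptions, not details taken from the SDS manuals.

```python
WORD_BITS = 24


def assemble(chars, char_bits=6):
    """Pack characters (ints of char_bits each) into 24-bit words."""
    if char_bits not in (6, 12, 24):
        raise ValueError("char_bits must be 6, 12, or 24")
    per_word = WORD_BITS // char_bits
    mask = (1 << char_bits) - 1
    words = []
    for i in range(0, len(chars), per_word):
        group = chars[i:i + per_word]
        word = 0
        for c in group:
            word = (word << char_bits) | (c & mask)
        # ASSUMPTION: a short final group is left-justified and zero filled.
        word <<= char_bits * (per_word - len(group))
        words.append(word)
    return words


def disassemble(words, char_bits=6):
    """Split 24-bit words back into characters, most significant first."""
    per_word = WORD_BITS // char_bits
    mask = (1 << char_bits) - 1
    chars = []
    for w in words:
        for k in range(per_word):
            shift = WORD_BITS - char_bits * (k + 1)
            chars.append((w >> shift) & mask)
    return chars
```

With 6-bit characters each word carries four characters, so one Mp access serves four character transfers on the peripheral side.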

In the 930 or 9300 K's the principal distinction is that the actual
data-path switching routes differ. From a program operation and
control viewpoint the Time Multiplexed and the Direct Access
Communication Channels (TMCC and DACC) and the Data
Subchannels (DSC) behave almost identically. The TMCC and
DSC differ from DACC in that the block control information
(number of words and location in memory) for the channel may
be either in primary memory or in local hardware memory associated
with the channel hardware.


[Figure: PMS diagram. Mp(#0:3) and Pc connect through switches to K(#W,Y,C,D) and K(#E,F,G,H), which serve T (typewriter, Teletype, card, paper tape, line, display) and Ms (magnetic tape, disk, drum, Magpak, paper tape); L(I/O bus) is under Pc programmed control, and L('Memory Interface Connection/MIC) leads to S('DMS) and K('DSC). Two line styles distinguish paths carrying control and data from paths carrying data only.]

Pc(1 address/instruction; 1 instruction/w; 24 b/w; technology: transistors)
Mp(core; 1.75 μs/w; 4 ~ 8 kw; (24, 1 parity) b/w)
S(concurrency: 1; 1.75 μs/w)
K('Time Multiplexed Communications Channel/TMCC)
K('Direct Access Communications Channel/DACC)
K('Data Subchannel/DSC)
S('Data Multiplexer System/DMS)
X := T | Ms

Fig. 2. SDS 930 PMS diagram.




[Figure: block diagram. Blocks shown: Main Frame; Parallel Input/Output (EOM, PIN, POT); Priority Interrupts; Memory, with additional optional memories; Multiple Access to Memory Feature; a first path with TMCC W, Y, C, and D; a second path with MIC, DACC E, F, G, and H; and the Data Multiplexing System with DSCs, optional EIN, and priority control.]

Where:
DSC  = Data Sub-Channel
MIC  = Memory Interface Connection
TMCC = Time Multiplexed Communications Channel
DACC = Direct Access Communications Channel

Fig. 3. SDS 930 computer-configuration diagram. (Courtesy of Scientific Data Systems.)




The 9300 structure, though not given in the PMS diagram, is
essentially that of the 930 (Figs. 3 and 4). In the 9300, Mp has
three access ports or a S('Memory-Processor; 8 Mp; 3 P,K). The
Pc('9300) requires two of the access ports for independent access
of instructions and data, leaving one for K transfer to Ms and T.


Instruction-set  processor 

The  interesting  parts  of  the  ISP  are  discussed  informally  below. 
The  formal  ISP  description  given  in  Appendix  1  of  this  chapter 
should  be  read.  The  descriptions  are  partially  taken  from  the  SDS 
Programming  Reference  Manuals. 


[Figure: block diagram. Labels shown: SDS 9300 Computer (Arithmetic and Control; Input/Output Control); Single-bit I/O Control and Sense; 24-bit I/O; Up to 1024 Priority Interrupts; Core Memory, expandable to 32,768 words (basic 4096-word memory; optional 4096-, 8192-, and 16,384-word memories); instruction/operand access is overlapped when separate memory modules are accessed; Time-Multiplexed Communication Channels (up to 30 devices/channel); Direct Access Communication Channels E, F, G, H (up to 30 devices/channel); Multiple Access to Memory; Data Multiplex System; Memory Interface Connections to/from special devices; 24-bit word parallel I/O; up to 128 Data Subchannels.]

Note: Broken lines indicate optional hardware.

Fig. 4. SDS 9300 computer-configuration diagram. (Courtesy of Scientific Data Systems.)




Registers  and  memory  (930) 

The  Pc  state  is  declared  in  the  ISP  description.  The  ISP  registers 
are  A,  B,  X,  P,  M,  and  miscellaneous  bits  for  overflow,  carry,  etc. 
Overflow  can  be  turned  on  for  arithmetic  overflow  in  addition, 
subtraction,  multiplication,  division,  and  left-shift  instructions. 

Data  formats 

General. A computer word, W, is 24 binary digits (bits) or 8 octal
digits. A word is numbered W<0:23> from left to right or alternatively
W<0:7>₈.

Fixed-point data format. Fixed-point numbers are represented in
two's complement form with the sign at W<0>. A 23-bit fraction
W<1:23> can be assumed. The binary point is to the left of bit
position 1 (W<1>). For integers, the binary point is to the right
of W<23>.
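
The two readings of a fixed-point word differ only in where the binary point is placed. A minimal Python sketch follows, with illustrative function names that are not part of the SDS documentation:

```python
WORD_BITS = 24
MASK = (1 << WORD_BITS) - 1


def to_signed(word):
    """Interpret a 24-bit word as a two's complement integer (point right of W<23>)."""
    word &= MASK
    return word - (1 << WORD_BITS) if word >> (WORD_BITS - 1) else word


def from_signed(value):
    """Encode an integer in the range -2**23 .. 2**23 - 1 as a 24-bit word."""
    if not -(1 << 23) <= value < (1 << 23):
        raise OverflowError("value does not fit in 24 bits")
    return value & MASK


def as_fraction(word):
    """Interpret the same word as a fraction (point left of W<1>)."""
    return to_signed(word) / (1 << 23)
```

For example, the word 20000000₈ reads as the integer 4,194,304 or as the fraction 0.5, depending on the convention the program adopts.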

Floating-point data format. Subroutines perform double- and
single-precision floating-point arithmetic. A floating-point word is
defined as f<0:47> := W[n:(n + 1)]<0:23>. Of course, single-precision
floating point requires less processing time.

The fractional portion (mantissa), f<0:38>, of a double-precision
floating-point number is a 39-bit proper fraction with the leading
bit being the sign bit and the binary point located to the left of
the most significant magnitude bit, f<1>.

The floating-point exponent is a 9-bit integer, f<39:47>, with
the leading bit being the sign, f<39>. The standard routines operate
on both fraction and exponent in two's complement form. If F
represents the contents of the fractional field and E represents the
contents of the exponent field, the number has the form F × 2^E.

Standard subroutines assume that the more significant word is
in the A register and that the less significant word is in the B
register. Correspondingly for Mp, the more significant word is in
Mp[x] and the less significant word in Mp[x + 1].

The single-precision floating-point representation is identical
to that of double-precision floating point; i.e., it takes two words.
However, the least significant bits of the mantissa, f<24:38>, are
not processed; thus there is a saving in time but not in space for
using single precision.
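
The field layout above is enough to decode a two-word floating-point number. The following Python sketch assumes only what the text states (39-bit two's complement fraction in f<0:38>, 9-bit two's complement exponent in f<39:47>, value F × 2^E); the function name and the `single` flag, which models "not processed" by simply ignoring f<24:38>, are illustrative assumptions.

```python
import math


def decode_float(w0, w1, single=False):
    """Decode the two-word format: w0 is the more significant word."""
    w0 &= 0o77777777
    w1 &= 0o77777777
    if single:
        w1 &= 0o777                # ASSUMPTION: f<24:38> is treated as zero
    f = (w0 << 24) | w1            # f<0:47>, with f<0> the leftmost bit
    frac = f >> 9                  # f<0:38>
    exp = f & 0o777                # f<39:47>
    if frac >> 38:                 # sign bit f<0>
        frac -= 1 << 39
    if exp >> 8:                   # sign bit f<39>
        exp -= 1 << 9
    # The binary point lies left of f<1>, so the fraction is frac / 2**38.
    return math.ldexp(float(frac), exp - 38)
```

As a check, the pair (20000000₈, 00000001₈) has F = 0.5 and E = 1, giving 1.0.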

Instruction  word  format  (930) 

The  computer  instruction  word  format  is  given  in  Fig.  5. 

W<0> is the Relative Address bit, R. Standard software loading
programs use this bit; central processor decoding logic does not
use or sense this bit. A 1 in W<0> causes some loading programs
to add the assigned location of the instruction to the address field
contents prior to actual storage into the assigned location.

[Figure: 24-bit instruction word, bits 0 to 23 (octal digits 0 to 7), with fields R (bit 0), X (bit 1), instruction code (bits 2 to 8), I (bit 9), and address field (bits 10 to 23).]

Fig. 5. SDS 930 instruction-format diagram.

W<1> is the Index Register bit, X. It determines whether or
not the index register will be added to calculate the effective
address.

W<2:8> is the Instruction Code field and determines the operation
to be performed. The Programmed Operator facility is
selected by W<2>; it is part of the Tag field W<0:2>.

W<9> is the Indirect Address bit, I. It determines whether or
not e or M[e] is to be used as the effective address (see below).

W<10:23> is the Address field and for most instructions represents
the location of the operand called for by the instruction code.

Address modification. Index and indirect addressing, used singly
or in combination, perform address modification after bringing the
instruction from memory but before executing it. The instruction
remains in memory in its original form. The results of indexing
and/or indirect addressing form the "effective address," e.

INDEXING  If the content of the index bit in an instruction is a
1, prior to execution the computer adds the contents, X<10:23>,
of the index register to the contents of the address field of the
instruction. This addition does not keep any overflow or carry
beyond the fourteenth address bit. This addition occurs prior to
any indirect action.

INDIRECT ADDRESSING  A 1 in the indirect address bit causes the
computer to decode the contents of the effective address, accessed
as described above, as if it were an instruction without an instruction
code; that is, the address logic reinitiates address decoding,
using the word in the effective location (the memory cell whose
address is the effective address). This is an iterative process and
provides multilevel indirect and indexed addressing. Each level
of indirect addressing adds an additional cycle time to the
instruction execution time.
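
The indexing and indirect-addressing rules can be combined into a single loop. The Python sketch below follows the description above (index first, drop any carry beyond 14 bits, then repeat the decoding on the fetched word while the indirect bit is set). The function name, the list used for memory, and the `max_levels` guard are assumptions added for illustration; the hardware has no such limit.

```python
ADDR_MASK = 0o37777      # W<10:23>, the 14-bit address field
X_BIT = 1 << 22          # W<1>, index register bit
I_BIT = 1 << 14          # W<9>, indirect address bit


def effective_address(word, memory, x_reg, max_levels=64):
    """Return (e, indirect_levels) for a 930-style instruction word."""
    levels = 0
    while True:
        e = word & ADDR_MASK
        if word & X_BIT:
            # Add X<10:23>; overflow beyond the fourteenth bit is dropped.
            e = (e + (x_reg & ADDR_MASK)) & ADDR_MASK
        if not word & I_BIT:
            return e, levels
        levels += 1              # each level costs one more memory cycle
        if levels > max_levels:  # ASSUMPTION: guard against endless chains
            raise RuntimeError("indirect chain too long")
        word = memory[e]         # decode the fetched word as an address word
```

The returned level count corresponds to the extra cycle times mentioned in the text.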

930 memory extension control registers. Core memory in the 930
is expandable to 32,768 words. However, the address field in the
instruction format is 14 bits long, allowing direct access of only
up to 16,384 words. Memory extension in the 930 contains two
3-bit memory extension registers, EM2 and EM3, and allows
addressing of memories of 32,768 words. The program loads either
or both of the registers and activates them as desired. Each register
can become the most significant digit (fifth octal) of any operand
address.

The program uses the first extension register, EM3, by calling
for an address with 11₂ in the most and next most significant
address bits, respectively (a 3 for the most significant octal digit).
The program calls for EM2, the second extension register, by
setting the same two address bits to 10₂ (a 2 for the most significant
octal digit). In this way, normal addressing compatible with the
910 and 920 occurs by setting a 3 in EM3 and a 2 in EM2.
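
Read literally, the extension rule replaces the leading octal digit of the address with the contents of EM3 or EM2. A small Python sketch under that reading follows; the function name is hypothetical, and the treatment of addresses whose leading two bits are 00 or 01 (passed through unchanged) is an assumption consistent with the text but not stated in it.

```python
def extend_address(addr14, em2, em3):
    """Map a 14-bit address to an extended one using EM2/EM3 (3 bits each)."""
    top = (addr14 >> 12) & 0b11     # two most significant address bits
    low = addr14 & 0o7777           # the remaining four octal digits
    if top == 0b11:                 # leading octal digit 3 selects EM3
        return ((em3 & 0o7) << 12) | low
    if top == 0b10:                 # leading octal digit 2 selects EM2
        return ((em2 & 0o7) << 12) | low
    return addr14 & 0o37777         # ASSUMPTION: 00 and 01 are not remapped
```

With EM3 = 3 and EM2 = 2 the mapping is the identity, which is the 910/920-compatible case described above.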

910-930  instructions 

Programmed Operators (POP's) enable subroutines to be called
with a single instruction. This provides definable instructions of
the same form as built-in machine instructions. The computer
decodes the operation codes 100₈ ~ 177₈ as special instructions
and transfers to a subroutine whose address is uniquely determined
by the code. The computer records the address of the POP
instruction at location 0 together with an indirect address bit so
that the program continuity may be maintained. By indirect
addressing which refers to location 0, which in turn refers to the
POP instruction, the subroutine can gain access to the effective
address of the operand associated with the POP instruction.
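
The POP linkage can be illustrated with a short Python sketch. The bit positions come from the 930 instruction format (code in W<2:8>, indirect bit W<9>). The text says only that the subroutine address is uniquely determined by the code; using the code value itself as the entry address is an assumption made here for illustration, as are the function names.

```python
I_BIT = 1 << 14          # W<9>, indirect address bit
ADDR_MASK = 0o37777      # W<10:23>
POP_BIT = 1 << 21        # W<2>, set for operation codes 100 to 177 (octal)


def opcode(word):
    """Extract the 7-bit instruction code W<2:8>."""
    return (word >> 15) & 0o177


def execute_pop(word, p, memory):
    """Simulate the POP linkage; return the new P, or None if not a POP."""
    if not word & POP_BIT:
        return None
    # Record where the POP came from, with the indirect bit set, so that an
    # indirect reference through location 0 reaches the POP's own address part.
    memory[0] = I_BIT | (p & ADDR_MASK)
    # ASSUMPTION: the entry address is taken to be the code value itself.
    return opcode(word)
```

A subroutine entered this way can fetch the POP's operand by addressing location 0 indirectly, since the word stored there points back at the POP instruction.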

The  instruction  set  for  the  computers  in  this  series  is  listed  in 
Table  1.  The  table  should  be  used  to  compare  the  machines. 

There are two instructions in the 910 which are not in the 920
or 930: Multiply Step and Divide Step. These instructions facilitate
writing subroutines for multiplication and division. The Multiply
Step (MUS) instruction is defined:

MUS → (B<23> → A ← A + M[e]; next AB ← AB/2);

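
A multiplication subroutine built from this step is the classic shift-and-add loop. The Python sketch below applies the MUS definition 24 times to unsigned operands; restricting it to unsigned values and carrying a 25th bit of A into the shift are simplifying assumptions, and the handling of signs in the actual 910 is given by the formal ISP description rather than by this code.

```python
MASK24 = (1 << 24) - 1


def mus(a, b, m):
    """One Multiply Step: if B<23> then A <- A + M[e]; next AB <- AB / 2."""
    if b & 1:                 # B<23> is the least significant bit of B
        a += m                # may carry into a 25th bit, kept for the shift
    ab = (a << 24) | b        # treat A and B as one double-length register
    ab >>= 1
    return ab >> 24, ab & MASK24


def multiply(multiplicand, multiplier):
    """Unsigned 24 x 24 -> 48 bit product from 24 multiply steps."""
    a, b = 0, multiplier & MASK24
    for _ in range(24):
        a, b = mus(a, b, multiplicand & MASK24)
    return (a << 24) | b      # high half in A, low half in B
```

Each step consumes one multiplier bit from the low end of B and shifts one product bit in at the high end, so after 24 steps AB holds the full double-length product.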
9300  instructions 

The  instruction  word  format  in  the  central  processor  is  shown  in 
Fig.  6. 


[Figure: 24-bit instruction word, bits 0 to 23 (octal digits 0 to 7), with fields I (bit 0), X (bits 1 to 2), instruction code (bits 3 to 8), and address field (bits 9 to 23).]

Fig. 6. SDS 9300 instruction-format diagram.


W<0> contains the Indirect Address bit, I.
W<1:2> contains the Index Register bits X<0:1>.
W<0:2> is called the Tag field.

W<3:8> contains the Instruction code; the contents of this field
determine the operation to be performed.

W<9:23> contains the Address; for most instructions, the contents
of this field represent the memory location of the operand
called for by the instruction code.

Address modification. Each index register contains an unsigned
base address of 15 magnitude bits and a signed increment of 9
bits. The increment contains 8 magnitude bits and a sign bit and
is held in two's complement form.

Index registers are modified by adding the signed-increment
value to the base address using two's complement arithmetic. Since
the increment and base address fields are of unequal lengths, the
sign bit (bit 0) of the increment field is extended six positions to
the left prior to the addition. This 15-bit sum is then stored in
the base address field of the index register. The index register may
be incremented by any value from -256₁₀ to 255₁₀ using a single
instruction. Incrementing and testing for a "terminal condition"
is done by the instruction Increase Index And Branch (BRX), as
follows:

If the index register has been negatively incremented, a terminal
condition exists when the base address has been reduced
below the zero value.

If the index register has been positively incremented, a terminal
condition exists when the resultant base address has been increased
beyond the maximum address value (077777₈).

If  the  terminal  condition  exists,  the  next  instruction  is  taken 
in  sequence.  If  the  terminal  condition  does  not  exist,  program 
control  is  transferred  to  the  location  specified. 
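
The increment-and-test rule can be sketched as below. This is an illustration under stated assumptions: the base and increment are passed as separate values rather than unpacked from the 24-bit index register, and the names `sign_extend_increment` and `brx_step` are hypothetical.

```python
BASE_MASK = 0o77777  # 15-bit base address field


def sign_extend_increment(incr9):
    """Interpret a 9-bit two's complement increment as a signed integer."""
    incr9 &= 0o777
    return incr9 - 0o1000 if incr9 & 0o400 else incr9


def brx_step(base, incr9):
    """Add the increment to the base; report whether the terminal condition exists.

    Returns (new_base, terminal). When terminal is False, BRX branches;
    when True, the next instruction in sequence is taken.
    """
    total = (base & BASE_MASK) + sign_extend_increment(incr9)
    terminal = total < 0 or total > BASE_MASK
    return total & BASE_MASK, terminal
```

A loop that counts an index down toward zero would therefore keep branching until the addition carries the base below zero, at which point control falls through.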

The instruction set for the 9300 is given in Table 1.

Pc  implementation 

All the processors of the series have basically similar register
configurations because of the common Instruction-set Processor.
However, the increasing complexities of the machines can be seen
by comparing the register structures of the 910-930 (Fig. 7) with
the 9300 (Fig. 8). The figures show both the registers accessible
to the program or defined by the ISP (denoted by °) and the
temporary registers which are necessary for the implementation.

910, 920, 930 registers (Fig. 7)

ISP registers (°). The A register is the main accumulator of the
computer. The B register is an extension of the A register. The
B register contains the less significant portion of double-length
numbers. Overflow and carry bits are used with A and B operations.

Fig. 7. SDS 910, 920, and 930 registers diagram. [Diagram not reproduced. It shows Mp (core memory, 24 b/w, 2,048 to 16,384 w), the M' memory buffer, the C buffer, the memory address register, program counter, index register, I/O buffers, parallel I/O, and console and miscellaneous bits, with connections to peripheral T and Ms. Notes: all registers are 24 bits except S<10:23>, O<3:8>, EM2<0:2>, and EM3<0:2>; ° marks registers accessible to the program (ISP); † only in the 930; 930 core memory is 32,768 w.]

The index register X, used in address modification, is a full-word
register. Index-register operations use the least significant 14 bits.

The P register is a 14-bit register that contains the memory
address of the current instruction. Unless modified by the program,
the contents of P increase by 1 at the completion of each instruction.

The memory extension registers, EM3 and EM2, are 3-bit
registers that specify the portion of extended memory being used.
They exist only in the 930.

Hardware registers not in the ISP. The S register is a 14-bit register
that contains the address of the memory location to be accessed
for instructions or data. The 15-bit address is formed by S and
one of the memory extension registers.


The 24-bit C register communicates with memory. Instructions
are temporarily held in C before instruction decoding. It is used
as an arithmetic and control register in multiply, divide, and other
operations. Address modification and parity generation and detection
use the C register.

The O register is a 6-bit register that contains the instruction
or operation code of the instruction being executed.

The M' register is a 24-bit register that holds each word as it
comes from memory. Recopying of a word into memory takes place
from the M' register.

9300 registers (Fig. 8)

ISP registers (°). The A and B registers of the 9300 are the same
as in the 900 series computers; however, the P register is P<9:23>.

There are three 24-bit index registers, X[1:3]. Each index register
is composed of a base address of 15 bits and a signed increment
of 9 bits.

The Flag register, F, is a 6-bit register that may be set and/or
sensed by the program. The first bit position of this register is the
overflow indicator.

Hardware registers not in the ISP. The C register holds the 24-bit
operand word as it is transmitted to, or received from, memory.

The D register holds the next 24-bit instruction word as it is
received from memory.

The 15-bit S register contains the address of the memory location
to be accessed for either instruction or operand.

The 6-bit O register contains the instruction code of the
instruction being executed.

The A' register is an optional 15-bit register used for the
floating-point option. It temporarily extends the A register during
the execution of floating-point instructions.

The B' register is an optional 15-bit register which temporarily
extends the B register during the execution of floating-point
instructions.

Instruction  interpretation  in  the  900  series 

The instruction-interpretation process can be explained in terms
of the processor's registers (Fig. 7). The ADD instruction execution
(not including memory mapping), defined in ISP as A ← A + M[e],
is interpreted as

S ← P; P ← P + 1; next          fetch the instruction
M' ← Memory[S]; next
C ← M'; next
O ← C<0:5>; next
(O = 05) → (                    ADD execution
    S ← C<10:23>; next          operand effective-address-calculation process (including indexing and indirect addressing)
    M' ← Memory[S]; next        final operand fetch
    C ← M'; next
    A ← A + C)                  add operation

Fig. 8. SDS 9300 registers diagram. (Courtesy of Scientific Data Systems.) [Diagram not reproduced. It shows the M registers, C (operands), D (instructions), the O register, program counter, operand address register, accumulator and extended accumulator, F (flag, with overflow), and X1 to X3 (index, each with increment and base fields), together with the adders, parity generation and check, memory control, direct parallel I/O, and console and miscellaneous bits. Note: only ° registers are accessible to the program.]

Input/output processing

Introduction

There are several methods of transferring data between Mp and
the K's. These methods will be described independently, and in
order of increasing complexity. They are:

1a    Single bit sent to a selected K (EOM instruction).

1b    Single bit sense (or bit detection) from a K (SKS instruction).

2      Word parallel to/from a K (POT/PIN instruction).

3      Interrupt from one of 1,024 K's on a priority basis to Pc.
K can signal Pc to execute a particular program.

4a    Time Multiplexed Communication Channel/TMCC (Internal
Interlace¹ feature).

4b    Time Multiplexed Communication Channel (External
Interlace²).

5      Direct Access Communication Channel/DACC (External
Interlace).

6a    Data Subchannel/DSC (Internal Interlace).

6b    Data Subchannel (External Interlace).

7      Memory Interface Connection/MIC link. A component has
a link to Mp.

¹The control information for the location of the next word transferred and
the number of words to transfer are kept in Mp.

²The control information is taken from registers within K.

Methods 1 to 3 above are completely under control of a program
and are simple time-independent instructions (or methods)
of transferring data to K's (and onto KT or KMs). The ISP description
(Appendix 1 of this chapter) has a detailed description of the
I/O devices and these I/O instructions.

Single-bit  control  and  sense 

Two instructions provide for single-bit ON/OFF control signals.
The first, EOM, transmits a control signal and a 14-bit address
to an external device or a function within the computer. The
second, SKS, selects an external device or computer function and
skips in response to a false (0) signal. In theory, up to 16,384 control
signals can be sent and 16,384 input signals tested. (A more
reasonable number of physical destinations would be 50.) Execution
of an EOM causes a signal of approximately 1.4 microseconds
duration to be transmitted.

EOM instruction format. EOM is used to select a specific I/O
device by placing a 1 in its select register. EOM requires one cycle.
W<2> = 0.

W<0:1> is reserved for special system address bits.
W<3:8> contains the EOM instruction code, 02.
W<10:11> contains the system mode specifier.
W<12:23> contains the 12-bit address field that specifies the
special system destinations.

SKS format. The SKS instruction format has each corresponding
bit field identical to the system EOM format. Execution of an SKS
causes a 14-bit address to be presented to all K's; the K being
addressed responds and is tested. If the addressed external K
supplies a "set" signal to the central processor, the computer
executes the next instruction in sequence from the SKS. If no signal
is set, the computer skips the next instruction in sequence and
executes the following instruction. No registers are affected except
the P register. SKS requires two or three Mp cycles if no skip or
skip, respectively, is executed.

Word  parallel  instructions 

Two instructions, Parallel Output (POT) and Parallel Input (PIN),
permit any word in Mp to be presented in parallel on a physical
connector to a K or, inversely, permit signals sent from a K to
be stored in Mp. The execution of a POT or PIN instruction sends
a signal to the external device involved in the input/output operation,
which notifies the device to send its data word as soon as
it is operational. When the device becomes operational during a
Read or PIN operation, it transmits a Ready signal to the central
processor while at the same time presenting a data word to Pc.

During the execution of a POT instruction, the central processor
transmits a signal to the external device, alerting it to receive
a data word. When the device becomes operational, it transmits
a Ready signal to the central processor, which releases the data
word to the external device.

Selective input/output with these devices is accomplished by
preceding POT or PIN with an EOM to alert (select) the desired
device by a specific address. By preceding the POT or PIN with
an SKS, the Ready signal of the special device can be tested after
the execution of the EOM but prior to execution of the parallel
transfer instruction; a possible Pc "hangup" can thus be avoided.
The Ready signal can also set one of the priority interrupts.
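
The select, test, then transfer sequence can be sketched as a toy model. This is only an illustration of the ordering of EOM, SKS, and POT; the `Device` class, its fields, and the helper names are hypothetical, and real device timing is not modeled.

```python
class Device:
    """Toy external device with a select bit, a Ready bit, and an output log."""

    def __init__(self):
        self.selected = False
        self.ready = False
        self.received = []


def eom(device):
    """EOM: alert (select) the device."""
    device.selected = True


def sks_ready(device):
    """SKS-style test: is the selected device's Ready signal set?"""
    return device.selected and device.ready


def pot(device, word):
    """POT: release a 24-bit word; the real processor would wait for Ready."""
    if not sks_ready(device):
        raise RuntimeError("Pc would hang here waiting for Ready")
    device.received.append(word & 0o77777777)


def try_output(device, word):
    """Select, test Ready first, and transfer only if no wait is needed."""
    eom(device)
    if not sks_ready(device):
        return False  # program can do other work and retry later
    pot(device, word)
    return True
```

Testing Ready before the transfer lets the program continue with other work instead of stalling inside POT.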

PIN  stores  the  contents  of  24  input  lines  in  parallel  in  the 
effective-memory  location.  PIN  or  POT  requires  four  cycles  plus 
any  waiting  time  for  Ready. 

Interrupt 

The interrupt provides program control of input/output operations,
aids in programming simultaneous input/output and compute
operations, and allows immediate recognition of special
external conditions by causing Pc to execute an instruction in a
selected Mp location at the end of the execution cycle of the
current instruction. Without disturbing the program register, the
processor executes an instruction in one of a selected set of memory
locations. A Mark Place and Branch (BRM) instruction in this
location saves the contents of the program register, EM3, EM2,
and overflow indicator and transfers to the particular interrupt
servicing routine required. To exit from the interrupt service
routine, a Branch Unconditionally (BRU) instruction using indirect
addressing returns control to the next instruction in proper sequence
in the main program; it also clears the interrupt. Processor
state (that is, A, B, Overflow, and X) must be preserved and
restored by the program if the registers are used by the program.

The priority interrupt system has up to 1,024 interrupts arranged
in levels. The levels have priority according to a priority
number; the higher priority levels have a smaller number. Interrupt
channels are installed in Pc in groups of 16. The assignment
of physical memory locations to interrupt levels is shown in Appendix 1
of this chapter; the assignment is in order of decreasing
priority from location 200₈ (highest) to 1477₈ (lowest). Interrupt
requests can also be programmed. The power fail-safe (for power
supply off) interrupts and out-of-order interrupts have the highest
priority.

Besides the interrupt mechanism just discussed, there is also
a single instruction interrupt. This permits the execution of only
one instruction before automatically being cleared and returning
to the program that was interrupted. For example, if an external
clock source is connected to the computer so that it pulses an
interrupt line at set intervals, the program can maintain a programmed
real-time clock. Each time the external pulse causes an
interrupt, the program executes the single instruction, Memory
Increment (MIN), to add 1 to the memory word selected for use
as a programmed real-time clock. (The main program can examine
this memory location whenever necessary to determine how many
time increments have elapsed since the clock was started.)

Interrupts  can  be  single  or  normal-instruction  interrupts  in  any 
combination  desired. 


An  interrupt  has  three  operational  states:  inactive,  waiting,  and 
active  states. 

In  the  inactive  state,  no  interrupt  signal  has  been  received  into 
the  level  and  none  is  currently  being  processed  by  its  interrupt 
servicing  subroutine. 

In the waiting state, an interrupt has been received but is not
being processed. This situation may arise when an interrupt of
higher priority is being processed. When all higher waiting interrupts
have been processed, this level goes to the active state.

In the active state, the interrupt has caused the main program
to recognize its presence and has transferred to its assigned
interrupt location where it is being processed.

Fig. 9. SDS 930 direct-access communication-channel register diagram. (Courtesy of Scientific Data Systems.) [Diagram not reproduced. It shows channel E with character input and output (6-bit + parity; 12-, 24-bit optional), parity check and generation, the channel registers and control logic, connections to up to 30 I/O devices (KMs or KT), address, data, and request lines to the channel control unit and the SDS 930 memory modules, and the other communication channels (F, G, H). Transfers go to Mp directly, or to Mp via Pc for channels W, Y, C, D. Some registers are marked as part of the interlace.]

Two program control features are Arm/Disarm and Enable/Disable.
Arm/Disarm controls whether an interrupt can proceed
from the inactive state to the waiting state. When armed, an
interrupt signal sets the interrupt to the waiting state. Enable/Disable
operates on the entire interrupt system. (When the interrupt
system is enabled, interrupts can occur.)
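
The three states and the two program controls can be sketched as a small state machine. This is an illustrative model under assumptions, not the SDS hardware logic: level 0 is taken as the highest priority, a waiting level of higher priority is allowed to preempt an active lower one, and all names are hypothetical.

```python
INACTIVE, WAITING, ACTIVE = "inactive", "waiting", "active"


class InterruptSystem:
    def __init__(self, levels):
        self.state = [INACTIVE] * levels   # index 0 = highest priority (assumed)
        self.armed = [True] * levels       # Arm/Disarm, per level
        self.enabled = True                # Enable/Disable, whole system

    def signal(self, level):
        """An interrupt signal moves an armed, inactive level to waiting."""
        if self.armed[level] and self.state[level] == INACTIVE:
            self.state[level] = WAITING

    def dispatch(self):
        """Activate the highest-priority waiting level, if nothing higher is active."""
        if not self.enabled:
            return None
        for level, current in enumerate(self.state):
            if current == ACTIVE:
                return None  # a higher-priority level is still being serviced
            if current == WAITING:
                self.state[level] = ACTIVE
                return level
        return None

    def clear(self, level):
        """Exit from the service routine returns the level to inactive."""
        self.state[level] = INACTIVE
```

A lower-priority request thus stays in the waiting state until every higher level has been cleared.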

Communications channels: Kio('Channel)'s

Kio('Communication Channels) provide buffering, input/output
control, and data transmission simultaneously with computation.
There can be up to eight independent communication channels
and a large number of subchannels in a single system. Figure 9
shows the registers in a K('Channel).

Each channel can control up to 30 KT's or KMs's. The channel
handles character, word assembly and disassembly, input/output
parity detection and generation, data transmission to and from
memory, and end-of-transmission detection.

All channels are bidirectional and can communicate with 6-bit
character devices or word devices in 6, 12, and 24 bits. The main
program that initializes a K specifies the number of characters to
be contained in each word during the transmission.

The channel interlace controls the transfer of the data words
going through the associated channel buffer, supplies the memory
address of data coming from or going to memory, and maintains
the word count determining the number of words transferred. This
interlace information can be either in K hardware (external interlace)
or in Mp (internal interlace). The terminal interrupts, End
of Record and Zero Word Count, come from the interlace and
are under its control.

The time-multiplexed channels use the memory-access logic of
Pc to transmit input and output of data words and require two
memory cycles (see Fig. 2). Each direct-access channel has independent
memory-access logic and requires one memory cycle (see
Fig. 2).

Communication-channel description. Up to 30 peripheral devices
(K's for T or Ms) may be connected to one K('Channel) (Fig. 9).
Each device has a unique, 2-digit, octal address by which it is
selected for an input/output operation. To select the peripheral
device, the program loads the proper unit address into the 6-bit
Unit Address Register (UAR) in the channel. This address selects
both the device and, if appropriate, the function to be performed.
Placing a nonzero unit address in the unit address register connects
the peripheral unit addressed to the channel, and the unit becomes
active. When the UAR contains a zero address, or any time that
a terminal or initial condition clears the contents of UAR, the
channel becomes inactive.

The 24-bit data Word Assembly Register (WAR) contains the
data word actively being received or transmitted during an input
or output operation. During input, 6-bit characters (plus parity)
enter the Single-Character Register (SCR) where the channel
buffer assembles them, one at a time, into the WAR.
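
Character assembly into the WAR can be sketched as follows. This is an illustration only: parity handling is omitted, and placing the first character in the most significant position is an assumption not stated in the text. The function names are hypothetical.

```python
def assemble_word(chars):
    """Shift up to four 6-bit characters, one at a time, into a 24-bit word."""
    if len(chars) > 4:
        raise ValueError("a 24-bit word holds at most four 6-bit characters")
    war = 0
    for ch in chars:
        war = ((war << 6) | (ch & 0o77)) & 0o77777777
    return war


def disassemble_word(word, count=4):
    """Inverse operation for output: split a word into `count` 6-bit characters."""
    return [(word >> (6 * (count - 1 - k))) & 0o77 for k in range(count)]
```

With this ordering, the characters 01, 02, 03, 04 (octal) assemble into the word 01020304 (octal).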

The channel interlace contains two working registers: the Word
Count Register (WCR) and the Memory Address Register (MAR).
A channel may have these registers either in K or in Mp. In the
setup sequence for an interlaced input/output operation, the POT
instruction transmits to the interlace a data word made up of the
word count (that is, length) and the starting address of the data
block. The 15-bit Word Count Register (WCR) contains the data
word count during a data transfer. The number of data words is
decremented by 1, and the new count replaces the old one in the
WCR for each word transmitted.

The Memory Address Register (MAR) contains the starting
destination or source address in memory of the transmitted data.
The memory locations to or from which data words are to be
transmitted enter the MAR at the same time the word count does.
During transmission of data, the interlace increments the MAR
after each word as it decrements the contents of the WCR. These
two registers provide the interlace control of block transmissions.
Obviously, if the interlace control registers are in Mp, then two
extra accesses are required for each word transferred.
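
The block-transfer bookkeeping can be sketched as below. This is a simplified model of an input transfer: memory is a Python dictionary, a transfer ends when the word count reaches zero, and the access count charges two extra accesses per word when the interlace registers are kept in Mp, as the text states. The function name and parameters are hypothetical.

```python
def interlaced_input(memory, mar, wcr, words, interlace_in_mp=False):
    """Store incoming words at successive addresses until the word count is zero.

    Returns (final_mar, final_wcr, memory_accesses).
    """
    accesses = 0
    for word in words:
        if wcr == 0:
            break  # Zero Word Count: the block is complete
        memory[mar] = word & 0o77777777
        accesses += 1 + (2 if interlace_in_mp else 0)
        mar = (mar + 1) & 0o77777   # MAR incremented after each word
        wcr = (wcr - 1) & 0o77777   # WCR decremented for each word
    return mar, wcr, accesses
```

For a three-word block, an external interlace costs three memory accesses while an internal interlace costs nine in this model.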

Memory  interface  connection  link 

Once  a  computer  is  equipped  with  a  multiple-access-to-memory 
feature,  one  or  more  Memory  Interface  Connections  (MIC)  can 
be  attached.  The  MIC  is  a  general  interface  to  the  computer  that 
allows  special  devices  to  access  Mp.  It  preserves  the  integrity  of 
the  memory  by  generating  the  parity  of  incoming  data  words  and 
checking the parity of words read from memory to indicate memory
failures. The device that is connected to the MIC must hold
both  the  data  and  the  address  until  the  transmission  to/from 
memory  is  completed  (that  is,  MIC  does  not  have  registers). 

Conclusions 

The  SDS  computers  appear  to  be  the  first  attempt  to  design  several 
computers  at  the  same  time  with  a  common  ISP.  Over  a  longer 
time  span  other  compatible  computers  were  added  to  the  original 
910  and  920  as  technology  (and  marketing)  dictated.  The  series 
is  characteristic  of  well-designed  typical  24-bit  computers.  By 
increasing  the  arithmetic  capability,  the  series  could  also  be  used 
more  generally. 

References 

Scientific  Data  Systems  Reference  Manuals  for  the  930  and  9300  computers 


APPENDIX 1  SDS 930 ISP DESCRIPTION

The description defines the Instruction Set without exact assignment of operation codes to instruction names. Input-output
instruction actions are given for the simple controls, but do not include the action of the channels or the devices.

Pc State

A<0:23>                  Accumulator; main arithmetic register
B<0:23>                  secondary arithmetic register for multiplier, quotient, etc.
AB<0:47> := A□B          combined 48 bit arithmetic register
X<0:23>                  Index Register
P<10:23>                 Program or instruction location counter for 16 kw
Overflow/Ov              set on integer operations
Carry := X<0>            used in multiple precision operations to link words
Run

Mp State

Memory[0:77777₈]<0:23>   32 kw primary memory

Two 3 bit map (or extension) registers extend the address space of Mp to 32 kw. EM2 holds a 4 kw block number when addresses
20000₈-27777₈ are used. EM3 holds the 4 kw block number for addresses 30000₈-37777₈.

EM2<0:2>                 Extension Memory registers
EM3<0:2>

Memory Mapping Process
This process maps the 16 kw address space into the 32 kw physical memory.

M[a]<0:23> := (
    (a < 20000₈) → Memory[a]<0:23>;
    (20000₈ ≤ a ≤ 27777₈) → Memory[EM2<0:2>□a<12:23>]<0:23>;
    (30000₈ ≤ a) → Memory[EM3<0:2>□a<12:23>]<0:23>)
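
The mapping process can be sketched as a function from a 14-bit program address to a 15-bit physical address. This is a minimal illustration: concatenation of the 3-bit extension register with a<12:23> is taken to mean the register value becomes the top three bits above the low 12 address bits. The name `map_address` is hypothetical.

```python
def map_address(a, em2, em3):
    """Map a 14-bit address a (0 to 37777 octal) into the 32 kw physical memory."""
    a &= 0o37777
    if a < 0o20000:
        return a                       # lower 8 kw is not mapped
    bank = em2 if a <= 0o27777 else em3
    return ((bank & 0o7) << 12) | (a & 0o7777)
```

For example, with EM2 holding 5, address 20005 (octal) refers to physical location 50005 (octal).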

Pc  Console  State 

Individual  registers  in  Pc  can  be  read  and  written  from  the  console. 

Breakpoint  or  sense  switches 

Instruction Format

instruction/i<0:23>
relative := i<0>                 unused by ISP; software relocation bit
index_bit/xb := i<1>
op_code/op<2:8> := i<2:8>
pop_code<0:5> := i<3:8>          programmed operation code value
indirect_bit/ib := i<9>
y<10:23> := i<10:23>             address field for 16 kw
μ                                microcoded instruction bits within an instruction

Effective Address Calculation Process

e<10:23> := (
    ¬ib → (
        ¬xb → y;
        xb → y + X);
    ib → (                                        iterative process of indefinite indirect addressing until
        ¬xb → (i<0,9:23> ← M[y]<0,9:23>);         no indirect bit, ib, is found
        xb → (i<0,9:23> ← M[y + X]<0,9:23>);
        next e))

e1<18:23> := e<18:23>                             shift count
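
The iterative character of the effective-address process can be sketched as a loop. This is an illustration under assumptions: memory is a Python dictionary and is not passed through the mapping process, only bits 0 and 9:23 of the instruction are reloaded on each indirect level (as the description reads), address arithmetic wraps at 14 bits, and a level limit is added so the sketch always terminates. The names are hypothetical.

```python
ADDR_MASK = 0o37777                   # y<10:23>, a 14-bit address
RELOAD_MASK = (1 << 23) | 0o77777     # bits 0 and 9:23 of a 24-bit word


def effective_address(i, x, memory, max_levels=64):
    """Follow indexing and indefinite indirect addressing for instruction word i."""
    for _ in range(max_levels):
        xb = (i >> 22) & 1            # index bit, i<1>
        ib = (i >> 14) & 1            # indirect bit, i<9>
        y = i & ADDR_MASK
        addr = (y + x) & ADDR_MASK if xb else y
        if not ib:
            return addr
        # reload bits 0 and 9:23 from the addressed word, then repeat
        i = (i & ~RELOAD_MASK) | (memory[addr] & RELOAD_MASK)
    raise RuntimeError("indirect chain did not terminate")
```

Because the indirect bit travels with each fetched word, a chain of any length is followed until a word with ib = 0 is reached.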

Instruction Interpretation Process

¬Interrupt_interpretation → (                                     normal interpretation
    instruction ← M[P]; P ← P + 1; next                           fetch
    Instruction_execution);                                       execute
Interrupt_interpretation → (                                      interrupt interpretation
    instruction ← M[200₈ + 20₈ × K_address + I_address]; next
    Instruction_execution)

Instruction Set and Instruction Execution Process

Instruction_execution := (

Load and Store Group
LDA → (A ← M[e]);                                 load A
STA → (M[e] ← A);                                 store A
LDB → (B ← M[e]);                                 load B
STB → (M[e] ← B);                                 store B
LDX → (X ← M[e]);                                 load index
STX → (M[e] ← X);                                 store index
EAX → (X ← e);                                    load index from e
XMA → (M[e] ← A; A ← M[e]);                       exchange A and M

Arithmetic Group
SUB → (Ov,Carry□A ← A - M[e]);                    subtract
ADD → (Ov,Carry□A ← A + M[e]);                    add
SUC → (Ov,Carry□A ← A - M[e] - Carry);            subtract with Carry
ADC → (Ov,Carry□A ← A + M[e] + Carry);            add with Carry
MIN → (Ov,M[e] ← M[e] + 1);                       memory increment
ADM → (Ov,M[e] ← M[e] + A);                       add to memory
MUL → (Ov,AB ← A × M[e]);                         multiply
DIV → (Ov,B ← AB/M[e]; A ← AB mod M[e]);          divide

Logical Group
ETR → (A ← A ∧ M[e]);                             extract
MRG → (A ← A ∨ M[e]);                             merge
EOR → (A ← A ⊕ M[e]);                             exclusive or

Microcoded Register Exchange Instruction
Each instruction can be formed from a series of microprogrammed operations. Compound microcoded instructions are shown below
without a μ.

CLA → (A ← 0);                                    μ clear A
CLB → (B ← 0);                                    μ clear B
CLR → (AB ← 0);                                   clear A and B
CLX → (X ← 0);                                    μ clear X
CAB → (B ← A);                                    μ copy A into B
CBA → (A ← B);                                    copy B into A
XAB → (A ← B; B ← A);                             exchange A and B
CXB → (B ← X);                                    copy X into B
CBX → (X ← B);                                    copy B into X
XXB → (X ← B; B ← X);                             exchange X and B
CAX → (X ← A);                                    μ copy A into X
CXA → (A ← X);                                    copy X into A
XXA → (A ← X; X ← A);                             exchange X and A
CNA → (A ← ¬A);                                   not A
BAC → (A ← B; B ← 0);                             copy B into A; clear B
ABC → (B ← A; A ← 0);                             copy A into B; clear A
STE → (X<15:23> ← B<15:23>; X<0:14> ← sign_extend(B<15>);
    B<15:23> ← 0);                                μ store exponent; exponent control bit
LDE → (B<15:23> ← X<15:23>);                      load exponent
XEE → (B<15:23> ← X<15:23>; X<15:23> ← B<15:23>;
    X<0:14> ← sign_extend(B<15>));                exchange exponent

End of microcoded instruction group
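
The exchange instructions in this group depend on the ISP convention that actions separated by semicolons occur together, so every right-hand side is read before any register is written. A minimal sketch of that convention follows; the `transfer` helper and the register dictionary are hypothetical.

```python
def transfer(regs, *moves):
    """Apply register transfers simultaneously.

    Each move is (destination, source), where source is a register name
    or an integer constant. All sources are read from the old state.
    """
    old = dict(regs)
    new = dict(regs)
    for dest, src in moves:
        new[dest] = old[src] if isinstance(src, str) else src
    return new


def xab(regs):
    """XAB: A <- B; B <- A (exchange A and B)."""
    return transfer(regs, ("A", "B"), ("B", "A"))


def bac(regs):
    """BAC: A <- B; B <- 0 (copy B into A, clear B)."""
    return transfer(regs, ("A", "B"), ("B", 0))
```

Executing CAB and then CBA as two separate steps would instead leave both registers holding the old value of A, which is why the exchange must be a single compound operation.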

Shift Group

LRSH → (AB ← AB / 2^e1 {logical});                logical right shift
RSH → (AB ← AB / 2^e1);                           right shift
RCY → (AB ← AB / 2^e1 {rotate});                  right cycle
LSH → (Ov,AB ← AB × 2^e1);                        left shift
LCY → (AB ← AB × 2^e1 {rotate});                  left cycle
NOD → (X ← X - normalize_exponent(AB);            normalize, decrease X
    AB ← normalize(AB));

Skip Test Group

SKE → ((A = M[e]) → (P ← P + 1));                 skip if A = M
SKB → (((M[e] ∧ B) = 0) → (P ← P + 1));           skip if B and M don't compare 1's
SKN → (M[e]<0> → (P ← P + 1));                    skip if M negative
SKR → (Ov,M[e] ← M[e] - 1; next M[e]<0> → (P ← P + 1));    reduce M, skip < 0
SKM → (((M[e] ∧ B) = (A ∧ B)) → (P ← P + 1));     skip on masked M
SKG → ((A > M[e]) → (P ← P + 1));                 skip if greater than M
SKD → (XR<0:23> ← abs(B<15:23> - M[e]<15:23>);    difference exponents and skip
    (M[e]<15:23> > B<15:23>) → (P ← P + 1));
SKA → (((M[e] ∧ A) = 0) → (P ← P + 1));           skip if A and M don't compare 1's

Branch Group

BRU → (P ← e);                                    branch unconditionally
BRX → (X ← X + 1; X<9> → (P ← e));                increment index, branch
BRM → (M[e]<0> ← Ov; M[e]<3:5> ← EM3; M[e]<1,2,9> ← 0;     mark place and branch;
    M[e]<6:8> ← EM2; M[e]<10:23> ← P; next                  used to call subroutines
    P ← e + 1);
BRR → (P ← M[e] + 1; Ov ← Ov ∨ M[e]<0>);          branch return; used in terminating subroutines
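
The subroutine linkage formed by BRM and BRR can be sketched as follows. This follows the two definitions literally and makes no claim about the exact value P holds at the moment BRM executes; memory is a Python dictionary without address mapping, and the function names are hypothetical.

```python
P_MASK = 0o37777  # P<10:23>, 14 bits


def brm(memory, e, p, ov, em2, em3):
    """BRM: store Ov, EM3, EM2, and P in M[e] (bits 1, 2, 9 cleared); return new P."""
    memory[e] = (
        ((ov & 1) << 23)        # M[e]<0>    <- Ov
        | ((em3 & 0o7) << 18)   # M[e]<3:5>  <- EM3
        | ((em2 & 0o7) << 15)   # M[e]<6:8>  <- EM2
        | (p & P_MASK)          # M[e]<10:23> <- P
    )
    return (e + 1) & P_MASK     # P <- e + 1


def brr(memory, e, ov):
    """BRR: P <- M[e] + 1; Ov <- Ov or M[e]<0>. Returns (new P, new Ov)."""
    word = memory[e]
    return ((word & P_MASK) + 1) & P_MASK, ov | ((word >> 23) & 1)
```

The first word of the subroutine thus serves as the link cell, and the saved overflow bit is merged back into Ov on return.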

Control Group

HLT → (Run ← 0);                                  halt
NOP → ;                                           no operation
EXU → (instruction ← M[e];                        execute
    Instruction_execution);

Overflow Test Group

OVT → (Ov → (P ← P + 1); (Ov ← 0));               overflow test
ROV → (Ov ← 0);                                   reset overflow

REO      Cx<i  ^  e  X<1 5>)      {Dv  ^  I  )  I 

record,  exponent 

Chapter  42     The  SDS  910  9300  series 


559 


Breakpoint  Test  Group 

((BPT  1  A  BPT<1>)  V  (BPT  2  A  BPT<2; )  V  (BPT  3  A  BPT<3>)  V  (BPT  h  A  BPT<'|>) )  ->  (P  -P  +  1 
Memory  Extension  Register  Control  Group 

SET  -»  (in5truction<17>  -  (EM2  ^ i ns t rue t ion<21 :23>) ; 

instruct!on<l6>  -  {EM3  -  i  ns  t  rue  t  i  on<l  8 : 20>) )  ; 
EXT  -♦condi  t  ion  -  (P  -  P  +  1  )  ; 

eondition   :=   ( { i  r>s  t  rue  t  ion-:22-    A  (EM2  =  2))   A  ( i  ns  t  rue  t  ion<23>  A  (EM3  =  3))) 
POP  -  (M[0]<D.9:23>  '-OvnlnP;  P  -  1 00„  +  pop^eode); 


EOM 

->  lOw 

POT 

PIN 

^  io„ 

SKS 

^  I0„ 

.instruction^xecution 
instruct!  on  execution 
instruct  ion  execution 
,in5truction,_pxecutIon 
) 

Input-Output  Control  fvom  the  Pc 

KT  and  VJ^^s  State 
Devices  consist    of  the  following  parts: 

IO^Device[0:777773] 

IOJ)utput[0:77777g]<0:23> 

IO^input[0:77777g]<0:23> 

IO^Ready[0:77777g] 

IO^Select[0:77777g] 
io^unit<0:U> 


IO Instruction Set
EOM → (io_unit
POT


(IO_Select[io_unit] ∧ IO_Ready[io_unit] → (
IO_Output[io_unit] ← M[e]; io_unit ← 0);

IO_Select[io_unit] ∧ ¬IO_Ready[io_unit] → (POT));

(IO_Select[io_unit] ∧ IO_Ready[io_unit] → (
M[e] ← IO_Input[io_unit]; io_unit ← 0);

IO_Select[io_unit] ∧ ¬IO_Ready[io_unit] → (PIN));
(io_unit ← e; next

(IO_Select[io_unit] ∧ IO_Ready[io_unit] → (
P ← P + 1);
io_unit ← 0);


Interrupt System State
Interrupt
I_RQ[0:63]<0:15>
I_ON[0:63]<0:15>

I_Signal[0:63]<0:15>

K_address<0:5>

I_address<0:3>


I_RQ[0:63]<0:15> ∧ I_ON[0:63]<0:15>


programmed operator; 64 user defined instructions called via
subroutine link in M[0]

see the definition of the IO instruction set below


end Instruction_execution; not including Input Output
instructions


name (or address) of a specific IO device; the EOM command
is first given to select the specific device; subsequent
commands are implicitly to the selected device

Input and Output Data buffers associated with specific
devices

bit for each device to denote when device is ready to transmit
data

a bit within each device denoting it has been selected for
an operation

the particular io device selected by the EOM command


command to select or address the device; energize output M
output data command

wait until ready
input data command

wait until ready

skip if signal is not set


controls whether interrupts will be processed
array of 1024 interrupt requests

array of interrupt enable to enable or inhibit interrupt
requests

group number

level number within a group of the active interrupt


Part 6 | Computer families


Section  2  |  The  SDS  910-9300  series,  a  planned  family 


The I_address and K_address combine (200₈ + 20₈ × K_address + I_address) to establish an interrupt address, 200₈ highest
priority and 200₈ + 1777₈ lowest priority.
Interrupt_level_state[0:63]<0:15>

There are three states associated with each interrupt: Inactive, Waiting, and Active.
Inactive means no I_Signal is present.

Waiting means the I_Signal has been received but is waiting to be processed.
Active means the interrupt has caused the main program to recognize its presence.
The instruction in M[200₈ + 20₈ × K_address + I_address] is executed upon interrupt. There are two kinds of interrupts: single
instruction allows one instruction to be executed and the interrupt level state is changed from active to inactive; and normal
requires that a mark place and branch, BRM, instruction be executed to save P. At the completion of the interrupt program,
a branch unconditional (BRU) indirectly via the BRM instruction restores the interrupt level. (That is, the Interrupt_level_state
is changed from Active to Inactive, and another I_Signal can be processed.)
Interrupt_interpretation

A state denoting that an interrupt is to be processed or the interrupt level state to be changed from Waiting to Active for
normal interrupts and Waiting to Active to Inactive for single interrupts. The interrupt processed is the highest of those
waiting provided there are no interrupts of highest level in the Active state.
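
As a rough illustration of the address rule and the selection rule described above, the following Python sketch computes the interrupt location from a group number and a level number and picks the next interrupt to process. The names State, interrupt_address, and next_interrupt are hypothetical. The sketch assumes that an Active interrupt blocks only Waiting interrupts of lower priority, which is one reading of the rule as stated.

```python
from enum import Enum

BASE = 0o200              # first interrupt location (200 octal)
LEVELS_PER_GROUP = 0o20   # 16 levels in each of the 64 groups


class State(Enum):
    INACTIVE = 0
    WAITING = 1
    ACTIVE = 2


def interrupt_address(k_address, i_address):
    """Location 200(8) + 20(8) * K_address + I_address."""
    if not (0 <= k_address <= 0o77 and 0 <= i_address <= 0o17):
        raise ValueError("group must be 0..63 and level 0..15")
    return BASE + LEVELS_PER_GROUP * k_address + i_address


def next_interrupt(states):
    """Pick the (group, level) pair to process, or None.

    `states` maps (group, level) to a State. Lower addresses have
    higher priority, so tuple order equals priority order.
    Assumption: any Active interrupt of higher priority blocks the
    candidate.
    """
    waiting = sorted(key for key, s in states.items() if s is State.WAITING)
    if not waiting:
        return None
    candidate = waiting[0]
    for key, s in states.items():
        if s is State.ACTIVE and key < candidate:
            return None
    return candidate
```

With these definitions, group 0 level 0 maps to location 200 octal and group 63 level 15 maps to 200 + 1777 octal, matching the highest and lowest priority locations given above.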

Interrupt  Control  Instructions 

EIR → (Interrupt ← 1); enable interrupt; turn on mode

DIR → (Interrupt ← 0); disable interrupt; turn off

IET → (Interrupt → P ← P + 1); interrupt test; skip if on

IDT → (¬Interrupt → P ← P + 1); interrupt disable test; skip if off
POT instruction to control the Interrupt System. EOM(20020) is first given to select the Interrupt System.

(POT ∧ IO_Ready[20020]) → (  interrupt control instructions

(c = 1) → I_ON[a]<0:15> ← I_ON[a]<0:15> ∨ b<0:15>;  arm a channel level group

(c = 2) → I_ON[a]<0:15> ← I_ON[a]<0:15> ∧ ¬b<0:15>;  disarm a channel level group
(c = 3) → I_ON[a]<0:15> ← b<0:15>);  channel level group


a<0:5> := M[e]<0:5>
b<0:15> := M[e]<8:23>
c<0:1> := M[e]<6:7>


group select or K_address
data for I_address
command control bits
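
The field definitions above can be exercised with a short Python sketch. It is an illustration only: the helper names field, decode, and apply_pot are hypothetical, bit 0 is assumed to be the most significant bit of the 24-bit word, and the c = 2 case is assumed to clear the selected enable bits (the scanned expression for that case is partly illegible).

```python
WORD_BITS = 24  # assumed word length, with bit 0 as the most significant bit


def field(word, first, last):
    """Return bits first..last of a 24-bit word (bit 0 is the MSB)."""
    width = last - first + 1
    shift = WORD_BITS - 1 - last
    return (word >> shift) & ((1 << width) - 1)


def decode(word):
    """Split M[e] into a<0:5>, b<0:15>, c<0:1>."""
    a = field(word, 0, 5)    # group select (K_address)
    b = field(word, 8, 23)   # one bit per level in the group
    c = field(word, 6, 7)    # command control bits
    return a, b, c


def apply_pot(i_on, word):
    """Update the enable array i_on (64 entries of 16 bits) in place."""
    a, b, c = decode(word)
    if c == 1:                   # arm: set the selected bits
        i_on[a] |= b
    elif c == 2:                 # disarm: clear the selected bits (assumed)
        i_on[a] &= ~b & 0xFFFF
    elif c == 3:                 # replace the whole group
        i_on[a] = b
    return i_on
```

For example, a word with a = 5, c = 1, and b = 00F0 (hex) arms four levels of group 5 and leaves every other group unchanged.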


Section  3 


The  IBM  System/360- 

a  series  of  planned  machines  which  span 

a  wide  performance  range 


In  this  introduction,  besides  making  some  general  comments 
on the IBM System/360, we will attempt an analysis of the
performance  and  costs  of  the  series.  Performance  is  notoriously 
difficult  to  measure,  as  we  noted  in  Chap.  3,  and  costs  are  even 
more  so.  With  respect  to  the  latter,  what  is  publicly  available 
are  price  data,  not  manufacturing-cost  data. 

These  prices  reflect  not  only  marketing  policies  but  also 
accounting  policies  within  the  organization  for  the  attribution 
of  costs  to  product  lines.  For  example,  we  have  had  to  determine 
Pc  and  Mp  prices  on  the  basis  of  incremental  Mp  prices  within 
a  C.  Nevertheless,  the  360  series  provides  two  things  which 
make  a  comparative  analysis  worthwhile.  First,  the  common  ISP 
makes  simple  performance  measures  more  comparable;  sec- 
ond, the  common  manufacturer  makes  relative  prices  more  a 
reflection  of  relative  costs  than  would  otherwise  be  the  case. 
Neither  of  these  aspects  is  perfect,  as  we  will  note  at  several 
points  in  the  discussion.  Nevertheless,  the  360  series  provides 
as good an opportunity to attempt cost/performance analysis
as  we  know.  Indeed,  this  opportunity  has  already  been  grasped 
in  a  paper  by  Solomon  [1966],  which  we  have  found  very  valua- 
ble and  use  to  provide  a  basis  of  Pc  power. 

Analyses  of  the  type  we  attempt  here  produce  only  rather 
crude  pictures  and  are  subject  to  question  if  all  the  input  data 
are  not  very  carefully  checked.  We  have  not  done  the  latter, 
depending  instead  on  published  sources.  For  the  purpose  of  this 
book,  illustration  of  the  style  of  analysis  seems  sufficient.  In 
addition,  using  a  performance  measure  based  only  on  Pc  power 
measurements,  as  we  do  here,  leaves  many  questions  un- 
answered because  it  does  not  address  the  soft  areas  of  analysis 
relating  to  throughput,  task  environment,  and  the  operating 
system  software. 

Unlike  the  other  introductions  in  this  book,  the  reader  may 
find  it  worthwhile  to  scan  this  one,  read  the  chapters  in  the 
section,  and  then  return  to  this  introduction  when  the  system 
has  become  somewhat  familiar. 

The IBM System/360 is the name given to a third-generation
series of computers which constitute the current primary
IBM product line. They all have a common ISP but differ in
interpreter speeds and PMS structure. Many PMS elements are
used in common, particularly K's, Ms's, and T's.

The System/360 series is presented both because IBM's
market dominance makes it the most prevalent current computer
and because its implementations span the largest performance
and price range of any series. The C('360) models
should be compared with one another (Table 1) to be aware
of their capabilities. Their introduction dates and their
relationship are shown in Fig. 1. Chapters 43, 44, and 32 discuss the
logical structure of the system, the implementations,¹ and the
microprogrammed Model 30.

A  succinct  description  of  the  design  goals  and  innovations 
is given in the abstract of the paper Architecture of the IBM
System/360 [Amdahl et al., 1964a]:

¹Chapters 43 and 44 are from IBM Systems Journal, vol. 3, no. 2, 1964, which
was  devoted  exclusively  to  the  System/360.  The  other  articles  (listed  in  the 
bibliography)  are  recommended  for  additional  details. 


[Fig. 1 is a timeline chart that did not survive extraction. Legible
model labels include 1130*, 1800*, 50, 62, 64, 67, TSS (software), 70,
75, 85, 91, 92, 95, and RCA Spectra 70†. Legend: A, announced;
D, delivery; E, exhibited; W, withdrawn.]
*Not part of System/360
†Uses same ISP

Fig. 1. IBM System/360 models introduction dates.




The architecture² of the newly announced IBM System/360
features  four  innovations: 

1  An  approach  to  storage  which  permits  and  exploits  very 
large  capacities,  hierarchies  of  speeds,  read-only  storage 
for  microprogram  control,  flexible  storage  protection,  and 
simple  program  relocation. 

2  An  input/output  system  offering  new  degrees  of  concur- 
rent operation,  compatible  channel  operation,  data  rates 
approaching  5,000,000  characters/second,  integrated 
design  of  hardware  and  software,  a  new  low-cost,  multi- 
ple-channel package  sharing  mainframe  hardware,  new 
provisions  for  device  status  information,  and  a  standard 
channel  interface  between  central  processing  unit  and 
input/output  devices. 

3  A  truly  general-purpose  machine  organization  offering 
new  supervisory  facilities,  powerful  logical  processing 
operations,  and  a  wide  variety  of  data  formats. 

4  Strict  upward  and  downward  machine-language  compati- 
bility over  a  line  of  six  models  having  a  performance 
range  factor  of  50. 

The  above  four  featured  innovations  are  all  stated  as  IBM 
Corporation  design  results.  It  seems  better  to  analyze  them  in 
terms  of  design  constraints  and  implementation  results.  It 
appears  that  the  design  constraints,  from  marketing  and  man- 
agement directions,  were  compatibility  (item  4  above)  and  the 
use  of  common  peripheral  equipment  (item  2  above).  Thus  we 
can  measure  the  360  design  in  terms  of  how  well  it  meets  these 
constraints.  With  some  minor  exceptions,  all  the  peripheral 
components  existed  at  the  time  of  the  design  and  had  been 
used  with  other  IBM  computers;  thus  a  goal  was  already  real- 
ized. A  measure  of  the  design  can  also  be  based  on  a  compari- 
son with  alternative  designs.  In  the  following  sections  we  sug- 
gest that  several  forms  of  multiprocessing  would  yield  higher 
performance  at  lower  cost.  A  difficult  and  important  constraint, 
though  not  mentioned  above,  is  the  necessity  of  program  com- 
patibility with  almost  all  earlier  IBM  computers. 

It  should  be  noted  that,  at  the  outset  of  the  IBM  System/360 
announcement,  another  company,  RCA,  adopted  the  360  ISP 
as  a  design  constraint  for  its  own  future  computer  development. 
Although  some  price-performance  characteristics  appear  to  be 
better in the RCA series, the implementation scheme is similar.

²The term architecture is used here to describe the attributes of a system as seen
by the programmer, i.e., the conceptual structure and functional behavior, as
distinct  from  the  organization  of  the  data  flow  and  controls,  the  logical  design, 
and  the  physical  implementation. 


The  lower  RCA  prices  do  not  reflect  entirely  implementation  and 
technology  but  include  RCA  marketing  and  profit  strategy.  In 
addition,  of  course,  there  should  have  been  lower  development 
costs. 

An  interesting  aspect  of  the  design  is  the  method  used  to 
implement  the  individual  computer  models  (of  the  range)  and 
their  associated  costs.  From  the  standpoint  of  innovation,  the 
360  was  the  first  computer  series  to  cover  a  wide  range.  The 
more basic P's (Models 20 ~ 65) were implemented via a
microprogrammed  processor.  This  is  based  on  a  computer 
program  within  an  M(read  only),  i.e.,  a  Read  Only  Storage/ROS, 
to  interpret  the  common  ISP.  A  payoff  from  this  implementation 
strategy  is  a  solution  to  the  "compatibility  design  constraint," 
which  is  the  ability  to  provide  compatibility  with  the  customer's 
previous  (IBM)  machine,  which  of  course  was  not  a  member 
of  the  360  series.  This  is  undoubtedly  the  most  difficult  con- 
straint to  meet  in  the  P  designs,  and  probably  the  most  signifi- 
cant real  innovation.  From  the  marketing  viewpoint,  it  provided 
the  user  with  a  crutch  to  go  from  a  former  IBM  computer  to 
the  System/360.  This  is  accomplished  through  "emulation," 
which  (as  defined  by  IBM)  means  the  ability  of  one  C  to  inter- 
pret another's  programs  at  a  reasonable  performance  level. 
These  emulations  are  realized  by  various  microprogrammed  P's 
being  designed  to  interpret  both  the  360  ISP  and  one  or  more 
of  IBM  704,  709,  1401,  1410,  1440,  1460,  1620,  7010,  7040, 
7044,  7070,  7074,  7090,  7094. 

Most  of  the  above  ISP's  have  a  different  structure  from  the 
360  ISP.  For  example,  the  1401  (Chap.  18)  series  instructions 
and  data  are  variable-length  character  strings;  the  1620  has 
variable-length  data  strings;  the  704  series  process  fixed-  and 
floating-point  data  with  single-address  instructions;  and  the 
7070  is  a  fixed-word  decimal  computer.  Thus  the  360  C's  repre- 
sent the  first  machines  to  be  two  logical  processors  in  the  same 
physical  implementation. 

The  emulated  speeds  are  often  better  than  that  of  the  origi- 
nal hardwired  computer.  This  is  not  surprising,  considering  the 
change  in  technology;  it  is  a  very  attractive  feature.  The  360 
Mp  performance  is  often  a  factor  of  5  to  10  times  the  "emu- 
lated" computers;  and  the  M(ROS)  data  rates  are  a  factor  of 
25  times  the  Mp's.  For  example,  the  Model  65  emulating  a  7090 
runs  faster  than  a  hardwired  7090  (Table  1).  The  use  of  an 
M(ROS)  for  defining  an  ISP  is  questionable  if  we  ignore  the 
emulation  constraint.  Note,  by  way  of  evidence,  that  the  hard- 
wired models  91  and  44  have  the  lowest  cost-to-performance 
ratios  in  the  series. 

There  are  minor  deviations  in  the  particular  models,  but  all 


564  Part  6  |  Computer  families 


Section  3  |  Tlie  IBIVI  System/360— a  series  of  planned  machines  which  span  a  wide  performance  range 


implementations  belong  to  a  common  ISP  subset.  The  Model 
20  and  the  Model  91,  the  extremes  of  the  series,  deviate  most 
from  the  standard  360  ISP.  The  range  of  models  (Table  1) 
shows  the  comparative  effects  of  implementation  on  the  actual 
processing  times.  For  example,  the  designers  of  the  various  C's 
were constrained by memory bandwidths. Since the core memories
have about the same cycle time (0.75 ~ 2.0 microseconds),
variation in bandwidth is obtained by increasing the data
path width from 8 to 64 bits and by increasing the number of
independent Mp's. By looking at just Mp bandwidth, for models
30 ~ 65, we obtain a range of 5.3 to 85 megabits/s, corresponding
to a performance range of about 1 to 16. By doubling
the  number  of  independent  memories,  this  factor  can  be  in- 
creased to  32.  These  models  correspond  to  a  Pc  performance 
range  of  1  to  32.  Although  we  might  expect  a  narrower  range 
(based  on  Mp  speed),  the  range  can  be  increased  by  perform- 
ance suppression  (at  the  low  end).  Power  range  can  be  in- 
creased by  lowering  the  absolute  performance  of  Model  30.  This 
is  accomplished  by  making  performance  tradeoffs  to  lower  cost. 
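
The bandwidth figures in this paragraph can be reproduced with a small calculation. The Python sketch below is illustrative only: the function name is hypothetical, and the 1.5-microsecond cycle time for the low end is an assumption chosen from within the quoted 0.75 to 2.0 microsecond range because it reproduces the 5.3 megabits/s figure.

```python
def bandwidth_mbit_per_s(width_bits, cycle_us, independent_mp=1):
    """Peak Mp bandwidth: bits per access divided by cycle time.

    Bits per microsecond equals megabits per second.
    """
    return independent_mp * width_bits / cycle_us


# Low end: 8-bit data path, assumed 1.5 microsecond cycle.
low = bandwidth_mbit_per_s(8, 1.5)

# High end: 64-bit data path, 0.75 microsecond cycle.
high = bandwidth_mbit_per_s(64, 0.75)

# Ratio of the two, and the ratio with two independent Mp's at the high end.
ratio = high / low
ratio_two_mp = bandwidth_mbit_per_s(64, 0.75, independent_mp=2) / low

print(f"{low:.1f} to {high:.1f} megabits/s, range 1 to {ratio:.0f}")
print(f"with two independent Mp's: 1 to {ratio_two_mp:.0f}")
```

Under these assumptions the widths and cycle times alone account for the factor of 16, and doubling the number of independent memories accounts for the factor of 32.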

Logic  technology 

The  logic  of  the  360  series  is  realized  in  a  hybrid  technology, 
composed  partly  of  integrated-circuit  techniques  and  partly  of 
the  solid-state  techniques  standard  in  second-generation  ma- 
chines. It  is  a  "thick-film"  technology  that  deposits  the  circuitry 
on  a  ceramic  substrate.  This  is  called  Solid  Logic  Technology 
(SLT)  and  is  used  solely  by  IBM.  This  production  technique 
allows  only  for  the  fabrication  of  passive  circuit  elements  on 
the  substrate.  The  semiconductor  elements  (diodes  and  tran- 
sistors) are  produced  independently,  using  standard  semicon- 
ductor production  techniques  on  a  wafer.  The  semiconductors 
are  then  cut  and  bonded  to  the  substrate,  and  the  complete 
SLT  logic  unit  is  encapsulated.  The  substrates  correspond 
roughly  to  logic  elements  (gates,  inverters,  flip-flops,  etc.).  The 
SLT  units  are  placed  on  larger  printed-circuit  boards. 

Although  SLT  differs  fundamentally  from  integrated-circuit 
technology,  the  overall  size  of  the  final  printed-circuit  boards 
is  about  the  same.  At  the  time  the  decision  was  made  to  develop 
the  technology,  it  was  unclear  that  integrated-circuit  technology 
would  reach  mass-production  state.  Thus  the  SLT  program  was 
an  intermediate  design  prior  to  integrated-circuit  technology. 
The  two  approaches  are  about  the  same  from  the  standpoint 
of  reliability,  especially  when  one  considers  the  soldered 
printed-circuit  mounting.  The  number  of  connections  to  the 
printed-circuit board is about the same. The production
technology of the 360 series is outstanding, perhaps surpassed only
by  the  360  marketing  plan. 

The  Instruction-set  processor 

The  following  discussion  covers  only  the  Pc.  The  instruction  set 
consists  of  two  classes,  Scientific  ISP  and  Data  Processing  ISP, 
which  operate  on  the  different  data-types.  These  data-types 
correspond  roughly  to  the  IBM  7090  (Chap.  41)  and  IBM  1401 
(Chap.  18).  For  the  scientific  ISP  they  are  half-  and  single-word 
integers,  address  integers,  single,  double,  and  quadruple  (Model 
85)  floating  point,  and  logical  words  (boolean  vectors);  for  the 
data-processing  ISP  they  are  address  or  single-word  integers, 
multiple  byte  strings,  and  multiple  digit  decimal  strings.  These 
many  data-types  give  the  360  strength  in  the  minds  of  its  various 
types  of  users.  The  many  data  types  may  be  of  questionable 
utility  and  constrain  the  ISP  design  by  having  to  perform  few 
operations,  rather  than  having  a  more  complete  operation  set 
for  a  few  basic  data  types.  The  viewpoint  taken  here  is  a  biased 
one;  we  feel  that,  unless  a  particular  data-type  adds  significant 
processing  and  storage  capability,  it  should  not  be  fundamental 
to  the  ISP.  The  decimal-string  integers  appear  to  cost  in  storage 
and  processing  time.  Their  redeeming  virtues  are  that  little  or 
no  conversion  is  required  at  input  or  output  time,  and  their 
internal  representation  is  easily  recognized  by  people. 

Advantages  of  general-registers  organization 

The  ISP  uses  a  general-register  organization.  The  ISP  power 
can  be  compared  with  several  similar  general-register  ISP 
structures  such  as  those  of  the  UNIVAC  1107,  1108;  the  DEC 
PDP-6,  PDP-10;  the  SDS  Sigma  5,  Sigma  7;  and  the  early 
general-registers-organized  machine  Pegasus  (Chap.  9).  Of  the 
above  machines  the  360  Scientific  ISP  appears  to  be  the 
weakest  in  terms  of  instructions  and  the  completeness  of  the 
instruction  set. 

For  example,  in  Pegasus,  PDP-6,  and  the  UNIVAC  1107 
symmetry  is  provided  in  the  instruction  set.  For  any  binary 
operation  b  the  following  are  possible: 

GR ← GR b Mp
GR ← GR b GR
Mp ← GR b Mp
Mp ← Mp b Mp

The  360  ISP  provides  only  the  first  two.  Additional  instructions 
(or  modes)  would  increase  the  instruction  length. 




In  the  System/360  the  only  advantage  taken  of  general 
registers is to make them suitable for use as index registers,
base  registers,  and  arithmetic  accumulators  (operand  storage). 
Of  course,  the  commitment  to  extend  the  general-purposeness 
of  these  general  registers  would  require  more  operations.  Chap- 
ter 3  (page  61)  suggests  advantages  for  general  register 
organizations. 

The  360  has  a  separate  set  of  general  registers  for  floating- 
point data.  This  provides  more  processor  state  and  temporary 
storage  but  again  detracts  from  the  general-purpose  ability  of 
the  existing  registers.  Special  commands  are  required  to  ma- 
nipulate the  floating-point  registers  independent  of  the  other 
general  registers.  Unfortunately  the  floating-point  instruction 
set  is  not  quite  complete  (e.g.,  fixed-  to  floating-point  conver- 
sion), and  several  instructions  are  needed  to  move  data  be- 
tween the  fixed  and  floating  registers. 

When  multiple  data-types  are  available,  it  is  desirable  to  have 
the  ability  to  convert  among  them  unless  the  operations  are 
complete  in  themselves.  The  System/360  might  use  more  data 
conversion instructions, for example, between the following:

1  Fixed  precision  integers  and  floating-point  data 

2  Address-size  integers  and  any  other  data 

3  Half-word  integer  and  other  data 

4  Decimal  and  byte  string  and  other  data  (decimal  string 
to  and  from  byte  string  conversion  is  provided) 

Some  of  the  facilities  are  redundant  and  might  be  handled 
by  better  but  fewer  instructions.  For  example,  decimal  strings 
are  not  completely  variable-length  (they  are  variable  up  to  31 
digits,  stored  in  16  bytes),  and  so  essentially  the  same  arith- 
metic results  could  be  obtained  by  using  fixed  multiple  length 
binary  integers.  This  would  remove  the  special  decimal  arith- 
metic and  still  give  the  same  result.  If  a  large  amount  of  fixed 
field  decimal  or  byte  data  were  processed,  then  the  binary- 
decimal  conversion  instructions  would  be  useful. 
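
The claim that fixed multiple-length binary integers could cover the same range can be checked with a little arithmetic. The Python sketch below is illustrative; the variable names are hypothetical, and it relies on the fact that the 16-byte decimal field holds its 31 digits as 4-bit codes with one further 4-bit code for the sign.

```python
MAX_DIGITS = 31

# Decimal string storage: 31 digit codes plus one sign code, 4 bits each.
decimal_bytes = (MAX_DIGITS + 1) * 4 // 8

# Binary storage for the same range of magnitudes, plus one sign bit.
largest = 10 ** MAX_DIGITS - 1
magnitude_bits = largest.bit_length()
signed_bits = magnitude_bits + 1

# Round up to whole 32-bit words.
words = -(-signed_bits // 32)
binary_bytes = words * 4

print(decimal_bytes, magnitude_bits, words, binary_bytes)
```

A 31-digit value therefore fits in four 32-bit words, which is the same 16 bytes that the decimal string occupies.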

The  communication  instructions  between  Pc  and  Pio  are 
minimal.  The  Pc  must  set  up  Pio  program  data,  but  there  are 
inadequate  facilities  in  Pc  for  quickly  forming  Pio  instructions 
(which  are  actually  yet  another  data-type).  There  are,  in  effect, 
a  large  number  of  Pio's  as  each  device  is  independent  of  all 
others.  However,  signaling  of  all  Pio's  is  via  a  single  interrupt 
channel  to  Pc. 

The  Pc  state  consists  of  26  words  of  32  bits  each: 


1  Program  state  word,  including  the  instruction  counter  (2 
words) 

2  Sixteen  general  registers  (16  words) 

3  Four  2-word  floating-point  general  registers  (8  words) 

Many  instructions  must  be  executed  (taking  appreciable  time) 
to  preserve  the  Pc  state  and  establish  a  new  one.  A  single 
instruction  would  be  preferable;  even  better  would  be  an  in- 
struction to  exchange  processor  states,  as  in  the  CDC  6600 
(Chap.  39). 
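
The size of the Pc state follows directly from the three items listed above. The short Python tally below is only a restatement of that arithmetic; the dictionary and variable names are hypothetical.

```python
WORD_BITS = 32

# Words of Pc state, taken from the list in the text.
pc_state_words = {
    "program state word": 2,
    "general registers": 16,
    "floating-point registers": 4 * 2,  # four registers of two words each
}

total_words = sum(pc_state_words.values())
total_bits = total_words * WORD_BITS
total_bytes = total_bits // 8

print(total_words, total_bits, total_bytes)
```

This is the amount of state that must be stored and reloaded on each change of task, which is why a single exchange instruction would be attractive.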

Addressing and multiprogramming

The methods used to address data in Mp have some disadvantages.
It is impossible to fetch an arbitrary word in Mp in
a single instruction. The address space is limited to a direct
address of only 2^12 bytes. Any Mp access outside the range
requires  an  offset  or  base  address  to  be  placed  in  a  general 
register.  Accesses  to  several  large  arrays  may  take  significant 
time  if  a  base  address  has  to  be  loaded  each  time.  The  reason 
for  using  a  small  direct  address  is  to  save  space  in  the  in- 
struction. We  know  of  no  published  attempt  to  analyze  the 
tradeoffs,  even  of  instruction  efficiency  alone,  although  un- 
doubtedly such  comparisons  were  made  within  IBM. 
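
A minimal Python sketch of this addressing scheme may make the limitation concrete. The function names are hypothetical; the 12-bit direct address comes from the text, while the 24-bit width of the resulting address is an assumption of the sketch.

```python
DIRECT_ADDRESS_BITS = 12
MAX_DIRECT = (1 << DIRECT_ADDRESS_BITS) - 1   # 4095
ADDRESS_MASK = (1 << 24) - 1                  # assumed 24-bit Mp address


def effective_address(base, index, displacement):
    """Sum of base register, index register, and the direct address field."""
    if not 0 <= displacement <= MAX_DIRECT:
        raise ValueError("direct address field holds only 12 bits")
    return (base + index + displacement) & ADDRESS_MASK


def reachable_without_base(address):
    """True if the address fits in the direct field alone."""
    return 0 <= address <= MAX_DIRECT


def base_and_displacement(address):
    """One way to split an address: base on a 4096-byte boundary."""
    return address & ~MAX_DIRECT, address & MAX_DIRECT
```

Any address above 4095 needs a register loaded with a suitable base before the access, which is the extra instruction the paragraph refers to.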

Another  difficulty  of  the  360  addressing  is  the  inhomogeneity 
of  the  address  space.  Addressing  is  to  the  nearest  byte,  but 
the  system  remains  organized  by  words;  thus,  many  addresses 
are  forced  to  be  on  word  (and  even  double-word)  boundaries. 
For  example,  a  double-precision  data-type  which  requires  two 
words  of  storage  must  be  stored  with  the  first  word  beginning 
at  a  multiple  of  an  8-byte  address.  (However,  the  Model  85, 
which  is  a  late  entry  in  the  series,  allows  arbitrary  alignment 
of  data-types  with  word  boundaries.)  When  a  general  register 
is used as a base or index register, the value in the index register
must correspond to the length of the data-type accessed. That
is, for the ith value of a half integer, single integer, single
floating, double floating (long), and quadruple floating (extended),
i must be multiplied by 2, 4, 4, 8, and 16, respectively,
to access the proper element.
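
The scaling and alignment rules can be written down in a few lines. The Python sketch below is illustrative: the names are hypothetical, the multipliers are those given in the text, and the alignment table is an assumption apart from the 8-byte rule for double floating, which the text states.

```python
# Bytes per element, which is also the index multiplier from the text.
ELEMENT_BYTES = {
    "half integer": 2,
    "single integer": 4,
    "single floating": 4,
    "double floating": 8,
    "quadruple floating": 16,
}

# Required boundary in bytes. Only the double floating entry is stated
# in the text; the others are assumed to equal the element length.
BOUNDARY_BYTES = {
    "half integer": 2,
    "single integer": 4,
    "single floating": 4,
    "double floating": 8,
}


def index_value(i, data_type):
    """Value the index register must hold to reach element i."""
    return i * ELEMENT_BYTES[data_type]


def is_aligned(address, data_type):
    """True if the byte address falls on the required boundary."""
    return address % BOUNDARY_BYTES[data_type] == 0
```

For instance, element 3 of an array of double floating values lies 24 bytes past the start of the array, and the array itself must begin at a multiple of 8.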

A  single  instruction  to  load  or  store  any  string  of  bits  in  Mp 
(as  provided  in  the  IBM  Stretch)  would  provide  a  great  deal  of 
generality.  Provided  the  length  were  up  to  64  bits,  such  an 
instruction  might  eliminate  the  need  for  the  more  specialized 
data-types. 

A basic scheme for dynamic multiprogramming is nonexistent
(i.e., although static multiprogramming is done, relocation
hardware  is  not  present).  Only  a  simple  method  of  Mp  protec- 
tion is  provided,  using  protection  keys  (see  Chap.  43,  page  597). 
This  scheme  associates  a  4-bit  number  (key)  and  a  1-bit  write 
protect  with  each  2  kby  block,  and  each  Pc  access  must  have 
the  correct  number.  Both  protection  of  Mp  and  assignment  of 
Mp to a particular task (greater than 2^4 tasks) are necessary
in  a  dynamic  multiprogramming  environment.  Although  the 
architects  of  System/360  advocate  its  use  for  multiprogram- 
ming, the  operating  system  does  not  enforce  conventions  to 
enable  a  program  to  be  moved,  once  its  execution  is  started. 
Indeed,  the  nature  of  the  360  addressing  is  based  on  absolute 
binary  addresses  within  a  program.  The  later  experimental 
Model  67  does,  however,  have  a  very  nice  scheme  for  protection, 
relocation,  and  name  assignment  to  program  segments  [Arden 
et  al.,  1966]. 
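
The key scheme described in this paragraph can be sketched as a table lookup. The Python below is a simplified illustration, not a description of the hardware: the function names are hypothetical, an access is assumed to be permitted only when its key equals the key of the block, and the separate 1-bit write protect is not modeled.

```python
BLOCK_BYTES = 2048          # one key per 2 kby block
KEY_BITS = 4
MAX_DISTINCT_KEYS = 1 << KEY_BITS


def block_of(address):
    """Index of the 2 kby block containing a byte address."""
    return address // BLOCK_BYTES


def access_allowed(block_keys, address, access_key):
    """Compare the key presented with the access to the block's key.

    Assumption: the access succeeds only on an exact match.
    """
    if not 0 <= access_key < MAX_DISTINCT_KEYS:
        raise ValueError("key must fit in 4 bits")
    return block_keys[block_of(address)] == access_key
```

With 4-bit keys there are only 16 distinct values, so more than 16 tasks cannot each be given a separate key, which is the limitation the text points out.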

PMS  structures  and  implementations  of  the  computer 

The  PMS  structures  of  the  various  models  in  System/360  are 
basically similar, except for the upper end of the series and for
the  Model  44  (complete  compatibility  can  be  purchased  as  an 
option).  We  take  up  the  main  group  first  and  then  discuss  the 
others  individually. 

Models 30, 40, 50, and 65

The PMS of Models 30, 40, and 50 is the tree-structured Mp-Pc
shown in Fig. 2.¹ They all use a P.microprogram, although
with different ISP's. Some gross characteristics are given in
Table  1.  The  Pc  of  Model  65  is  also  microprogrammed,  but  it 
has  hardwired  Pio's.  A  PMS  diagram  of  Model  65  (and  Model 
75)  is  given  in  Fig.  3. 

The C structures with M(ROS) use a single physical P.microprogram
to realize the Pc, the Pio('Multiplexor Channel),
and the Pio('Selector Channel). This technique of using a single
shared physical P for multiple logical P's with fast changing
of P.state is the same one that Pio('Multiplexor) uses. The

¹The structure of the Mp's does not include the local M's used for access control,
i.e.,  the  storage  protect  key  mechanism,  which  it  is  hoped  the  student  will  forget 
about  (forever). 


[Fig. 2: PMS diagram, not recoverable from the scan. Legible elements:
Mp(#0:3); P(c,io) := P(microprogram) with Mp(read only; microprogram; '360 ISP program)
and Mp(working); Pio(#1:192; 'Multiplexor) and Pio(#1,2,3; 'Selector);
L('Selector, Multiplexor Busses); K(#0:191) and K(#0:7); L(to: external C(s));
Mp('2361-2 Large Capacity Store/LCS; 8 µs/w; t.access: 3.2 µs; 8 by/w; (8,1 parity) b/by).
Legible notes: See Table 1 for parameters. Present only in Model 50.
See Figures 11 to 16. Only 8 physical K's.]

Fig. 2. IBM System/360 Models 30, 40, and 50 PMS diagram.




[Fig. 3: PMS diagram, not recoverable from the scan. Legible elements:
T.console; Mp(#0:3); Pc(('2065; microprogrammed) | '2075; Table 1); K('Direct);
P('2870) := Pio(#1:192), Stm, K(#0:191), with further Pio's on Sfx and K(#0:7);
P('2860) := Pio(#1:3), Sfx, K(#0:7) (shown twice);
Mp('2365-3); Mp(#0,1; '2365-2; core; .75 µs/w; 8 by/w; 16 kw; (8,1 parity) b/by);
Mp('2361-2 Large Capacity Store/LCS; 8 µs/w; t.access: 3.2 µs; 262 kw; 8 by/w; (8,1 parity) b/by);
S(time multiplexed; concurrency: 1; 'Bus Control Unit/BCU).
Legible notes: Pio('2870 IO Multiplexor Channel); Pio('2870 IO Selector Subchannel);
Pio('2860 Selector Subchannel); Only 8 physical K's; See Figures 11 to 16.]

Fig. 3. PMS structure for IBM System/360 Models 65 and 75 PMS diagram.

Pio('Multiplexor)  is  equivalent  to  multiple  Pio's.  Within  the 
physical  P  both  interrupts  and  polling  are  used  to  switch  among 
the  P's.  Polling  is  used  to  service  the  several  P's  since  the  main 
program  loop  of  the  ISP  interpreter  returns  to  a  common  point 
each  time  the  next  instruction  is  fetched.  That  is,  the  interpre- 
tation cycle  for  the  360  ISP  starts  by  fetching  the  instruction, 
proceeds  to  fetch  the  operands,  executes  the  instruction,  and 
then  returns  results  to  Mp.  The  instruction-interpretation  proc- 
ess takes  only  a  few  Mp  references  for  most  instructions. 

A  few  instructions  require  a  long  (or  indefinite)  interpreta- 
tion time,  e.g.,  character  translate,  edit,  etc.,  since  the  opera- 
tions are  on  character  strings.  Here,  the  iterative  program  loop 
which  operates  on  each  character  of  the  string  must  test  the 
attached  K's  to  detect  when  the  Pio  interpreter  is  to  be  run  for 
data  transfers.  The  long  instructions  can  take  several  hundred 
microseconds  and  cannot  be  interrupted;  thus  the  response 
time  for  an  interrupt  can  be  very  poor.  Figure  4  gives  a  simpli- 
fied picture  of  the  registers  organization  of  a  Model  50,  but  it 
is  also  typical  of  Models  30,  40,  and  65. 
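
The effect of uninterruptible long instructions on response time can be illustrated with a small model. The Python sketch below is an assumption-laden simplification: the function name is hypothetical, the instruction times are invented for the example, and an interrupt is taken to be recognized only at the common point reached between instructions.

```python
from itertools import accumulate


def interrupt_delay_us(instruction_times_us, request_time_us):
    """Delay from an interrupt request to the next instruction boundary.

    Boundaries fall at time 0 and after each instruction completes.
    """
    for boundary in accumulate(instruction_times_us, initial=0):
        if boundary >= request_time_us:
            return boundary - request_time_us
    raise ValueError("request arrives after the last instruction ends")


# Hypothetical program: three short instructions around one long
# character-string instruction of 300 microseconds.
program = [4, 4, 300, 4]

# A request during a short instruction waits a few microseconds.
short_wait = interrupt_delay_us(program, 3)

# A request just after the long instruction starts waits almost 300.
long_wait = interrupt_delay_us(program, 9)

print(short_wait, long_wait)
```

In this model the worst-case delay is set by the longest single instruction, not by the average instruction time.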

The actual System/360 ISP interpretation program in each
of the models is different. In addition, each model has
microprograms for interpreting other ISP's through emulation. Tucker
[1967] discusses how the models were changed as the emulation
constraint was added. Table 1 gives the computers which
each of the models can emulate. A register structure of the
C('30) and the operation for the P.microprogram ISP are given
in Chap. 32, page 386. Tables 2 and 3 in Chap. 44 give the
additional parameters which influence the instruction
interpretation rate of the P.microprogram. The significant
parameters for a P.microprogram are the M(ROS) hardware
characteristics (speed, size, and information width); the number
of fields in the M(ROS) instructions, which gives an indication
of the number of control functions performed in parallel; the
M(general register) rates and their location in the structure;
the Mp data rate; and the characteristics of M(temporary)
within P. The activity of transferring data from a K, via the
Pio('Selector), is done concurrently with normal instruction
interpretation in Models 30, 40, and 50. A program in M(ROS)
sets up the data transmission with Mp, and transmission is
controlled by an independent hardware control.

Model  20 

This model is a subset of the System/360. It has eight 16-bit
general registers. It is possible to write programs which will run
on both the Model 20 and other models. Model 20 does not
have Pio's, and Pc issues instructions to control the attached
K's.

Model  25 

The Model 25 is an interesting C. Perhaps some of the interest
of the authors is caused by the mystery (to the authors) as to
what its ISP is. Its ISP is no doubt described in maintenance




[Data-flow diagram. Labels: ROS (Read Only Storage, Micro-Coded
Sequencing Control); Main Storage; Multiplexer Channel Control Storage;
Local Storage (General Registers, Floating-Point Registers, Selector
Channel Control Storage, Working Registers).
A = one-byte-wide data path; B = four-byte-wide data path.]

Item                              Capacity/Number  Data Width  Access/Speed/Rate
General registers                 16               4 bytes     0.5 microsecond R/W cycle/4 bytes
Floating-point registers          4                8 bytes     0.5 microsecond R/W cycle/4 bytes
Adder                                              4 bytes     0.5 microsecond
Local storage                                                  0.5 microsecond R/W cycle/4 bytes
Read only storage                                              0.5 microsecond Rd cycle
Basic machine cycle                                            0.5 microsecond
Multiplexer channel
  Burst mode                                       1 byte
  Multiplex mode                                   1 byte
Selector channel                                   4 bytes
Data transfers
  Processor to storage                             4 bytes
  Storage to storage                               4 bytes
  Selector channel to processor                    4 bytes
  Multiplexer channel to processor                 1 byte
  Control unit to channel                          1 byte

Fig.  4.  IBM  System/360  Model  50  data-flow  diagram  and  system  characteristics.  (Courtesy  of  International 
Business  Machines  Corporation.) 




manuals. We can make the following observations based on its
characteristics taken from its manual of Functional
Characteristics. These appear in Table 1. The observations are:

1 It has a very high-performance Mp, namely, Mp(core;
0.9 μs/w; 16|24|32|48 kby; 2 by/w); the Mp power is
almost that of a Model 50.

2 There is a relatively straightforward Pc which is
microprogrammed. The Pc uses Mp for its memory. The
System/360 ISP is defined in conventional M(read, write).
Of the Mp(48 kby) 16 kby is reserved for a microprogram.

3 Its performance is between that of Models 20 and 30,
performing a 360 ISP instruction in about 80 μs.

4 The penalty paid (slowdown factor) to interpret the 360
ISP is therefore 80/1.8 ≈ 45.

5 A small 180-nanosecond local store is used for operands.

6 The Pc cost appears to be about the lowest in the series.

We  should  ask  ourselves: 

1 Why do we want an intermediate-level P.microprogram
with its own M.read-only, as in the other processors?
These P's just seem to waste power.

2 Why should we bother to implement an intermediate-level
360 ISP? We know the final user will write programs in
a much higher level language. Thus two levels of
interpretation are required instead of one. It is assumed that
to program a given task will take, say, x μs if using the
360 ISP. We assume the same task programmed directly
in the Pc could take as short a time as x/45 μs if the Pc
were used directly.


We assume that if the P.microprogram, which is used to define
the System/360 ISP, were used to interpret a FORTRAN ISP,
the speed for a Model 25 FORTRAN ISP might easily approach
that of the Model 50.
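
The slowdown argument is simple arithmetic, and a few lines make it explicit. The 80 μs and 1.8 μs figures are the ones quoted in the observations above; the variable and function names, and the 1000 μs sample task, are illustrative assumptions only.

```python
ISP_INSTRUCTION_US = 80.0   # time for one 360 ISP instruction on the Model 25 (from the text)
BASE_STEP_US = 1.8          # divisor used in the text's slowdown estimate

slowdown = ISP_INSTRUCTION_US / BASE_STEP_US


def direct_time_us(isp_time_us, factor=slowdown):
    """Best-case time if a task taking isp_time_us under the 360 ISP
    were programmed directly in the Pc (the text's x/45)."""
    return isp_time_us / factor


print(round(slowdown, 1))                # 44.4, which the text rounds to 45
print(round(direct_time_us(1000.0), 1))  # 22.5
```

This is a best case, as the text says: it assumes the task maps onto the Pc as efficiently as the 360 ISP interpreter does.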

Model  44 

Model 44 does not use M(ROS); its Pc and Pio are
hardwired (Models 75 and 91 are also hardwired). The PMS structure
of the Model 44 is given in Fig. 5. Models 44 and 91 stand
out as having better performance per unit of cost than their
nearest neighbors, which are implemented with M(ROS), as can
be seen from Table 1. It must be noted that Models 44 and
91 are not strictly compatible with the 360 ISP since they do
not process variable-string and variable-decimal-data formats,
although Model 44 options can make it completely compatible.
(Subroutines will probably perform satisfactorily for most
applications.)

The PMS structure of the Model 44 (Fig. 5) is a tree. The
C('44) structure indicates 2-Pio('High Speed Multiplexor
Channels/HSMPX) which are between a P('Selector) and
P('Multiplexor) in power, since a single physical P('HSMPX) with four
subchannels can behave as four independent Pio's. The
organization of the Model 44 Pc registers is given in Fig. 6, which
reveals a straightforward implementation. The heavy lines in
Fig. 6 indicate an ORing of register outputs to form a single
data bus (usually 16 or 32 bits wide). The 16-bit crossover
function box allows the right and left halves (16 bits) of the
input to be exchanged when output. Almost all the units are
registers (except the adders, parity generators, and ORers). The
A, Ax, B, and Bx registers are used as the M.working for
performing instructions, where the x indicates an extension
register used in the 64-bit floating-point operations. The C register


[PMS diagram. Components: Mp(core; 1 μs/w; 8192 ~ 32768 w; 4 by/w;
(8,1 parity) b/by); T.console; Pc; Pio('Multiplexor Channel);
Pio('High Speed Multiplexor Channel/HSMPX); S; K(#0:63); K(#0:1).]

Only 8 logical K's.
See Figures 11 to 16.


Fig.  5.  IBM  System/360  Model  44  PMS  diagram. 




Processor Storage

System Model    Bytes      Words
E44             32,768     8,192
F44             65,536     16,384
G44             131,072    32,768
(unlabeled)     262,144    65,536

[Data-flow diagram. Legible labels: Manual Data Entry from System
Control Panel (Data Entry, Address Entry); Reg for R2; GR 0 to GR 15;
B Reg; Bx Reg; C Reg; 32-bit data paths.]

FPR = Floating-Point Register
GR = General Register
Op = Operation Code
SAR = Storage Address Register
SDR = Storage Data Register
8, 4, 32, etc. = Bit width of the ci...
21-23, etc. = Bit numbers
* Includes parity
‡ High-Speed General Registers
+ Can be displayed on system c...

Fig. 6. IBM System/360 data flow in Model 44 CPU. (Courtesy of International Business Machines Corporation.)




is  a  second  operand  register  used  for  arithmetic  and  logical 
operations. 

Model  75 

The PMS structure of Model 75 is given in Fig. 3. Models 65,
67, 75, and 91 all use the same basic Mp('2365; core). The S(n
Mp; m P), which switches between the n Mp modules and the
m Pc and Pio's, varies with model, however. C('65) and C('75)
use a simple time-multiplexed S in Pc, called the S('Bus Control
Unit/BCU). This S makes decisions about which P is to use
which Mp, rather than having each Mp arbitrate the P
requesting service locally. When the memories are all about the same
speed, such an S is all right; however, it has severe limitations
when slow-speed memories (8 microseconds for the large core
store) and high-speed memories (0.75 microsecond) are intermixed. The
principal difference between Models 65 and 75 is that C('75)
is hardwired and, depending on the size of the configuration,
may have lower cost/performance.

The simplified functional unit diagram of C('75) (Fig. 7) is
more abstract than the register interconnection diagram of a
C('44) (Fig. 6). From this description (Fig. 7) of the logic design,
one is able to conjecture what is necessarily within the
instruction, execution, variable field length, and decimal functional
units. The diagram is presented at a nonuniform level at both
the PMS and register-transfer levels. There is somewhat more
detail than in the PMS structure (Fig. 3). The Model 75 is
possibly the first System/360 to require an intermediate-level
diagram between a PMS structure and a register-transfer
diagram. The instruction unit contains the instruction location
counter (part of the ISP) and is responsible for obtaining the
next instruction and the operands. Since there can be overlap
in the instruction fetching process, this unit is responsible for
holding a number of instructions and stores up to 128 bits
(2 double words) of instructions at a time. The execution unit
and the variable field and decimal units carry out operations
on data. The execution unit processes floating-point and
fixed-point data.

Model  67 

The Model 67 was introduced in April, 1965, for the purpose
of time sharing. The entry was prompted by M.I.T.'s project
MULTICS. M.I.T. had ordered a GE 645 for experimental
research in time sharing. IBM formed a group for the development
of a time-shared computer and responded with the Model 67.
The Model 67 is essentially a Pc('65) with adequate S's for
multiprocessing and a K between Mp and Pc for
multiprogramming and memory mapping. Because of software uncertainties,
the Model 67 ran as a Model 65 in most installations (in 1968).
The University of Michigan and M.I.T.'s Lincoln Laboratory, the
first two customers having considered the MULTICS proposal,
were instrumental in outlining the specifications [Arden et al.,
1966]. Several 67's have been delivered, and the software
continues to evolve and be scheduled for completion (see Fig. 1).
Questions of costs per console must wait until the system is
stable enough to test and evaluate, although in April, 1969,
IBM considered the system attractive (operational) enough to
market. The most significant outcomes of the experiment to
date are:

1 The hardware seems capable of supporting a
straightforward time-sharing system [Corbato et al., 1962]. Had
IBM first developed a simple system based on proved
concepts, they would be capable of undertaking research
into more complex systems like the version to which they
originally committed themselves. (Vendors should have
some basis of actual operating experience before
committing a product to market.)

2 The problems of building really large-scale software
systems are not fully understood yet.

3  The  idea  of  a  virtual  memory  with  a  large  address  space 
(2-'-w)  is  excellent.  Many  storage  allocation  problems  are 
simplified  by  this  concept.  Unfortunately,  the  system 
software  builders  seem  well  on  their  way  to  filling  such 
a  memory.  Thus  the  new  freedom  allows  relaxation  in 
this  level  of  programming. 

4 There is a problem of getting users into Mp.core so that
Pc can be kept busy. Thus a swapping system is often
found waiting for Ms.drum or Ms.disk information. Work
at Carnegie-Mellon University using a Mp('LCS; core;
.5 1 mw; 8 by/w; 8 μs/w) seems to indicate that a
large number of users can have adequate response from
the Model 67 if the users reside in core and are not
subjected to swapping [Lauer, 1967; Fikes et al., 1968].

The  above  items  relate  to  the  software.  The  hardware  (Fig. 
8)  is  interesting  from  several  aspects.  First,  there  are  adequate 
facilities  for  memory  mapping  and  program  segmentation.  This 
general  scheme  is  outlined  in  Fig.  9.  In  the  Model  67  a  user's 
segment  and  page  maps  are  in  Mp,  and  these  maps  point  to 
physical  Mp  blocks  of  the  program.  Each  time  a  reference  is 
made,  the  map  is  checked  for  the  actual  reference.  In  order 
to  avoid  the  accesses  to  Mp  for  each  Mp  reference,  a  K,  with 
an  M(content  address),  is  located  between  Pc  and  Mp  to  trans- 




[Data-flow diagram. Labels: Multiplexor Channel (one byte each);
Selector Channels (one byte); 2365 Processor Storage (Main Storage);
Storage Control; 16 General Registers; Four Floating-Point Registers;
Execution Unit; Variable Field Length and Decimal Unit. Data paths are
marked eight bytes, four bytes, and one byte wide.
* One byte address bypass.]

Item                           Data Width                   Access/Speed/Rate
2365 Processor Storage         8 bytes                      0.75 microsecond storage cycle (all models)
2361 Core Storage              8 bytes                      8 microsecond storage cycle (all models)
16 General registers           1 word                       200 nanoseconds
4 Floating-point registers     2 words                      200 nanoseconds/word
Addressing adder               3 bytes                      200 nanoseconds
Parallel adder                 8 bytes                      200 nanoseconds
Exponent adder                 1 byte                       200 nanoseconds
Serial adder                   1 byte                       200 nanoseconds
Basic machine cycle                                         200 nanoseconds
2860 selector channel          1 byte (8 bytes to storage)  1.3 million bytes per second
2870 Multiplexor channel       1 byte (8 bytes to storage)  110 kb to 450 kb
  Burst mode                   1 byte                       50-110 kb
  Multiplex mode               1 byte                       50-110 kb
  Selector subchannel          1 byte                       100 kb, each

Fig.  7.  IBM  System/360  Model  75  data-flow  diagram  and  system  statistics.  (Courtesy  of  International  Business 
Machines  Corporation.) 




[PMS diagram. Components: Mp(#0:7); K(#0:1; 'Dynamic Address
Translation), containing M(integrated circuit; content addressable;
t.access: 150 ns; 8 w; address: 20 b; data: 9 b); Pc(#0:1; '2067);
K('Direct); T.console; S(#0:1; '2846 Channel Controller); Pio('2870);
Pio('2860; #1:3).]

1 Mp('2365-12) := (M(#0:1; '2365-2; .75 μs/w; 16 kw; 8 by/w; (8,1 parity) b/by)-S-)
2 Mp('2361-2 Large Capacity Store/LCS; 8 μs/w; t.access: 3.2 μs/w; 262 kw; 8 by/w;
  (8,1 parity) b/by)
3 S(8 M; (? ~ 6) P; cross-point; concurrency: 8; t.delay: .1 μs; distributed; location: M;
  bus)
4 S(1 M; 2 P; cross-point; concurrency: 2; t.delay: 1 μs; distributed; location: M; bus)
See Figure 3 for Model 65.


Fig.  8.  IBM  System/360  Model  67  PMS  diagram. 


form  a  24-  or  32-bit  virtual  address  in  Pc  into  an  actual  19-  to 
22-bit  physical  address  in  Mp.  This  K  is  not  shown  in  Fig.  9 
because  it  is  not  logically  necessary.  The  scheme  suggested 
in  Fig.  9  uses  control  bits  in  the  map  to  determine  legal  Mp 
accesses.  In  the  Model  67  the  storage  key  mechanism  holds 
whether  a  given  page  can  be  accessed  by  a  given  numbered 
user  (instead  of  associating  the  control  with  the  mapping  as 
shown  in  Fig.  9). 
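
The mapping just described (segment and page maps held in Mp, with a small content-addressed memory in front of them) can be sketched as follows. This is a minimal illustration, not the Model 67 mechanism itself: the field widths, the `Translator` class, the oldest-first replacement rule, and the count of two map references per miss are assumptions made for the example. Only the 8-entry size of the associative memory comes from Fig. 8.

```python
from collections import OrderedDict

SEG_BITS, PAGE_BITS, OFFSET_BITS = 4, 8, 12   # assumed field widths (24 bits in all)
ASSOC_ENTRIES = 8                             # size of the associative memory (Fig. 8)


class Translator:
    def __init__(self, segment_table):
        # segment_table: {segment: {page: physical_page}}; stands in for the maps in Mp.
        self.segment_table = segment_table
        self.assoc = OrderedDict()   # (segment, page) -> physical page
        self.map_references = 0      # Mp accesses spent consulting the maps

    def translate(self, vaddr):
        offset = vaddr & ((1 << OFFSET_BITS) - 1)
        page = (vaddr >> OFFSET_BITS) & ((1 << PAGE_BITS) - 1)
        segment = (vaddr >> (OFFSET_BITS + PAGE_BITS)) & ((1 << SEG_BITS) - 1)
        key = (segment, page)
        if key not in self.assoc:
            # Miss: one reference to the segment table and one to the page table.
            self.map_references += 2
            frame = self.segment_table[segment][page]   # KeyError models an unmapped page
            if len(self.assoc) >= ASSOC_ENTRIES:
                self.assoc.popitem(last=False)          # evict the oldest entry (assumed policy)
            self.assoc[key] = frame
        return (self.assoc[key] << OFFSET_BITS) | offset


t = Translator({1: {2: 0x15}})
vaddr = (1 << 20) | (2 << 12) | 0x0AB
first = t.translate(vaddr)        # walks the maps
second = t.translate(vaddr + 1)   # same page: answered by the associative memory
```

After the first reference to a page, later references to it cost no extra Mp accesses, which is the purpose the text gives for placing the K between Pc and Mp.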

Second, the Model 67 is the first acknowledgment by IBM
of multiprocessor computers, since it provides adequate
switching to allow multiple Pc's. The C('65) multiprocessing
configuration has been introduced based on Model 67 structure.
Multiprocessors are necessary for reliability, not solely for
performance reasons.

The PMS structure of C('67) in Fig. 8 does not have to use
the S('Bus Control Unit/BCU),¹ as in the C('65). The C('67) can
have an S in each Mp, so that four P's can communicate with
an Mp, as shown in Fig. 8. Each Mp makes the decision about
the P request to be honored next. Thus the problem of having
an "all knowing" S('BCU) is solved by allowing each Mp to do
local scheduling, rather than having a dialogue with another
component (with time delays). The S('BCU) in a duplex C('67)
is still present, but with less power, in the form of the S('2846
¹A system with only one port at Mp, controlled by BCU, is called a simplex. A
system with multiport Mp is called a duplex.


Channel  Controller).  It  is  used  to  arbitrate  the  Pio  accesses  to 
Mp. 

Without multiprocessing, the Pc seems very badly
mismatched with respect to Mp. Consider, for instance, the data
rates on the C('67). From Fig. 8 its maximum possible Mp
data rates are:

For 1 Mp('2365-12):

$$\frac{2 \times 64\ \text{bits}}{0.75\ \mu\text{s}} = 171\ \text{megabits/sec}$$

and for 1 Mp('2361 Large Core Store):

$$\frac{64\ \text{bits}}{8\ \mu\text{s}} = 8\ \text{megabits/sec}$$

Thus the total data rate is

$$171 \times 8 + 8 \times 4 = 1{,}368 + 32\ \text{megabits/sec} \approx 1{,}400\ \text{megabits/sec}$$

The processing rate is approximately

$$\frac{64\ \text{bits}}{2.2\ \mu\text{s}} = 29\ \text{megabits/sec}$$

An Ms.drum rate is approximately 10 megabits/sec.




[Diagram. Processor component: the logical (virtual) address from the
processor consists of a segment number, a page number within the
segment, and a word (cell) number within the page; the user segment
table register holds the segment table length and the segment table
origin. Address translation (user maps): the segment table² holds, for
each segment, the page table length and the origin of the page table;
the page tables for segments² hold, for each page, control¹ information
and the origin of the page. Primary memory component: the resulting
physical address consists of a physical page and a word (cell) within
the page.]

"+" an addition operation
¹ access and activity information (read, write, read only, etc.)
² located in primary memory during execution


Fig.  9.  Memory  allocation  using  pages  and  segments. 


Thus, for the several P's, an effective Mp request rate of 100
megabits/sec might be needed. The data-flow mismatch
(between Mp and the P's) occurs because of the P's, the S (the
L's connecting P and Mp), the lack of P's, and the fact that
t.access ≈ 1/2 t.cycle.
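
The data-rate comparison above can be checked with a short calculation. The cycle times, word widths, and module counts are the ones used in the text; the function and variable names are illustrative only.

```python
def megabits_per_sec(bits, cycle_us):
    """Bits moved per microsecond is numerically equal to megabits per second."""
    return bits / cycle_us


fast_module = megabits_per_sec(2 * 64, 0.75)   # one Mp('2365-12): 2 x 64 bits per 0.75 us
lcs_module = megabits_per_sec(64, 8.0)         # one Mp('2361 Large Core Store)
total = 8 * fast_module + 4 * lcs_module       # 8 fast modules and 4 LCS modules
processor = megabits_per_sec(64, 2.2)          # approximate Pc processing rate

print(round(fast_module), round(lcs_module), round(total), round(processor))
# prints: 171 8 1397 29
```

The unrounded total is about 1,397 megabits/sec; the text reaches 1,400 by rounding the per-module figure to 171 first. Either way, the memory can supply roughly 48 times what a single Pc absorbs.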

The Pio('2870), used in Model 65 and above, is described
at two structural levels in Fig. 3. The Pio includes a large
M.working to store the state of each of the logical Pio's. This
Pio state includes the instruction location counter, the control
state bits (active, running, interpreting an instruction,
processing data, etc.), and buffering (one 8-byte word). By having an
M.buffer, the demands on Mp from the Pio's are reduced by
a factor of 8. Although the expected data rate from many K's
does not require the extra M, there are possible times when
the uncertainty of the access times for Mp might cause data
loss. Since the M.working is necessary to store the Pio state,
the additional space for buffering is not expensive. An
alternative design might use Mp for this buffering.
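
The factor of 8 follows directly from collecting single bytes into an 8-byte word before each Mp reference. The sketch below counts Mp references with and without such a buffer; the function name and the 4096-byte transfer size are assumptions chosen for illustration.

```python
WORD_BYTES = 8   # one 8-byte word of buffering per logical Pio, as stated above


def mp_references(byte_count, buffer_bytes):
    """Mp references needed to store byte_count bytes when the bytes are
    collected in a buffer of buffer_bytes before each write to Mp."""
    return -(-byte_count // buffer_bytes)   # ceiling division


unbuffered = mp_references(4096, 1)
buffered = mp_references(4096, WORD_BYTES)
print(unbuffered, buffered, unbuffered // buffered)   # 4096 512 8
```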

The  four  Pio('2860  Selector  Channel)'s  are  implemented  as 
independent  Pio's,  using  conventional  hardwired  logic  and 
buffering.  However,  they  are  packaged  as  one  unit. 

Model  85 

The Model 85 was announced in February, 1968, with the goal
of being the highest-performance System/360 model in production. Its
performance is approximately 3 to 5 times that of the Model 65, and in
some cases it outperforms a Model 91 [Conti et al., 1968].

The PMS diagram of the Model 85 is shown in Fig. 10. The
Pio, T, Ms structure is identical to that of Models 65 and 75
(Fig. 3). The two interesting aspects of the structure in Fig. 10
are the M(content addressable; 'Buffer Storage; 16|32 page;
1024 by/page) and the Pc. The pages are filled in groups of
64 bytes, as references to a particular physical block in Mp.core
are made. Conti [1968] gives running times for various
programs as a function of buffer memory size. Multiprogramming
may degrade the performance more than any other case. This
process, which has been referred to as "look aside," or a "slave
memory," was suggested by Wilkes [1965]. It is completely
analogous to the Model 67 M(content addressable; 8 w) which
is used to hold the segment-page map for a multiprogrammed
time-sharing system. It is also analogous to a one-level storage
system (Atlas; see Chap. 23) which is formed from two physical
M's whose performance differs significantly. Here, the effect
is to try to approximate a computer with a large Mp(80 ns/w)
by using a large Mp(1 μs/w) and a small Mp(80 ns/w). The
CDC 7600 (page 475) has a similar structure, but the Mp-Ms
migration is under programmed control.
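
The behavior of such a buffer can be sketched briefly. The page size, the 64-byte fill unit, and the page count are taken from the text; the `BufferStorage` class, the valid-block bookkeeping, and the least-recently-used replacement rule are assumptions made for this illustration and are not claimed to match the Model 85 hardware.

```python
from collections import OrderedDict

PAGE_BYTES = 1024    # buffer page size given in the text
BLOCK_BYTES = 64     # pages are filled in groups of 64 bytes
NUM_PAGES = 16       # the text gives 16 or 32 pages


class BufferStorage:
    def __init__(self, num_pages=NUM_PAGES):
        self.num_pages = num_pages
        self.pages = OrderedDict()   # Mp page number -> set of valid block indices
        self.block_fetches = 0       # 64-byte transfers from Mp.core

    def reference(self, address):
        """Return True if the byte at address was already in the buffer."""
        page, within = divmod(address, PAGE_BYTES)
        block = within // BLOCK_BYTES
        if page not in self.pages:
            if len(self.pages) >= self.num_pages:
                self.pages.popitem(last=False)   # drop least recently used page (assumed)
            self.pages[page] = set()
        self.pages.move_to_end(page)             # mark this page most recently used
        valid = self.pages[page]
        if block in valid:
            return True
        valid.add(block)                         # fetch only the referenced 64-byte block
        self.block_fetches += 1
        return False


buf = BufferStorage(num_pages=2)
hits = [buf.reference(a) for a in (0, 63, 64, 1024, 2048, 0)]
```

With only two pages in the example, the reference to address 2048 displaces page 0, so the final reference to address 0 misses again. This is the kind of loss the text attributes to multiprogramming, where several programs compete for the same small buffer.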

The P.microprogram used for controlling the
Pc(K('Execution Unit)) allows for great flexibility in the definition of ISP's.
An Mp(500 w) is available for the user; this may be loaded by
a program, and it specifies an ISP. One standard option is to
emulate the 704-7094 series.

The  Model  85  removes  the  restriction  of  aligning  words  at 
particular  boundaries.  Thus  any  logical  word,  independent  of 
its  length,  can  be  located  at  any  physical  location  addressed 
in  bytes. 




[PMS diagram. Components: Mp¹; S('Storage Control); M('Buffer
Storage)⁴; L(#1:3)²; L(in: 16 by; out: (8,16) by); Pc⁶; T.console⁵;
K('Direct).]

1 Mp(core; ('I85; M(#1:2; '2365-5; 1.04 μs/w; 262 kby)) | ('J85; M(#1:4; '2365-5;
  1.04 μs/w; 262 kby)) | ('K85/2385 Model 1; .96 μs/w; 2 mby) | ('L85/2385 Model 2;
  .96 μs/w; 4 mby); M('Protection Key Storage Elements; 128 ~ 1024 w; 6 b/w);
  (16 + error) by/w; 8 b/by; single error detection and correction, double error
  detection)
2 L(#1:3; Pio('2870 Multiplexor Channel)³, Pio(#1:2; '2860 Selector Channel)³;
  8 by; (8,1 parity b/by))
3 See Figure 3 for Models 65 and 75.
4 M.buffer('Buffer Store; integrated circuit; (16384 ~ 32768) by; 80 ns/w; content
  addressable; data: 1024 by; address: 9 ~ 12 b)
5 T.console((CRT; display), keyboard, (microfiche; reader))
6 Pc := K('Instruction Unit) with D(operation: +; 3 by), M.buffer(instruction;
  16 by/w), and Mps(16 w; 4 by/w); K('Execution Unit) with D, M.buffer,
  M.parameter(read only; 80 ns/w; 2000 w), and M.parameter(read write; 80 ns/w;
  500 w); C.microprogrammed


Fig.  10.  IBM  System/360  Model  85  PMS  diagram. 


The Pc's data operation performance is impressive. A
fixed-point multiply is done in 0.4 μs, and a floating-point multiply
takes 0.56 μs (not including accesses).

The  data-type,  extended  floating-point  number,  is  used  in 
Model  85.  Thus  a  24-,  56-,  or  112-bit  fraction  part  can  be  used. 

Model  91 

This model has a very low cost/performance ratio (see Table
1). Only about 20 Model 91's were produced before it was
withdrawn from the market. It has the highest performance of
the series. The Mp cycle time is 0.75 μs, but 16 modules are
overlapped to provide a theoretical maximum bandwidth of
16 × 64/0.75 = 1,370 megabits/s. About 2.5 mega-instructions/s are
executed; thus, a total of 160 megabits/s of Mp are absorbed by Pc.
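
The two bandwidth figures can be reproduced as below. The module count, word width, cycle time, and instruction rate come from the paragraph above; the assumption that each instruction absorbs 64 bits of Mp traffic is inferred from the ratio 160/2.5 and is not stated in the text.

```python
MODULES = 16           # overlapped Mp modules
WORD_BITS = 64         # bits delivered per Mp access
CYCLE_US = 0.75        # Mp cycle time in microseconds
MEGA_INSTR_PER_S = 2.5
BITS_PER_INSTR = 64    # assumed: inferred from 160 / 2.5

peak_mbits = MODULES * WORD_BITS / CYCLE_US          # about 1,365; the text rounds to 1,370
absorbed_mbits = MEGA_INSTR_PER_S * BITS_PER_INSTR   # 160.0
fraction_used = absorbed_mbits / peak_mbits

print(round(peak_mbits), absorbed_mbits, round(fraction_used, 2))
# prints: 1365 160.0 0.12
```

On these figures the Pc uses only about an eighth of the theoretical memory bandwidth.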

There are other interesting models in the '90 series: the
Model 92 was a paper machine,¹ and the Model 95 was
unannounced but produced, a version of the Model 91 with an
Mp(integrated circuit; 60 ns/w; 8 by/w). The Model 91 is not covered
in any detail here because of space limitations. It is similar to
other very large computers in that many techniques are
employed to obtain parallelism. The January, 1967, IBM Journal
of Research¹ is devoted to design issues of the Model 91.

Models  1130  and  1800 

These  computers  are  presented  as  reference  points  and  have 
nothing  to  do  with  the  C('360).  They  are  implemented  outside 
the  System/360  framework  but  use  its  technology,  and  so  cost 
comparisons  are  still  somewhat  meaningful.  These  computers 

¹See bibliography at the end of this chapter.




[PMS diagrams of the special P's and K's:

a. L(C(Pio)) - K('Channel to Channel Adapter; used to transfer data
   among 2 C's) - L(C(Pio)).
   Interconnection of 2 computers (or within a computer) for
   transmission of information.

b. L(S('Selector Channel; used in place of regular channel)) -
   P(block transfer; 'Storage to Storage Channel).
   Processor for the transmission of information (vectors) within Mp.

c. L(Pio) - S - K(#A; '2903 Special Control Unit/SCU) - X, with a
   second S - K(#B; 'SCU) - X; X := (C|K|T|Ms).
   Interconnection to other controls and computers.

d. L(S('Selector Channel, Models 44, 65, 75; used in place of regular
   channel)) - P(array; '2938; microprogrammed; Mps(~64 w; 32 b/w);
   operations: (vector move, vector multiplication, vector inner
   product, sum of vector elements, sum of squares, convolution,
   difference equation, fixed floating conversion); data lengths:
   scalar, vector, matrix; data-types: fixed, floating).
   Array Processor.]


are  straightforward,  and  for  a  given  task  which  does  not  use 
floating-point  arithmetic,  they  should  perform  as  well  as  any 
System/360  model.  The  arguments  we  use  for  the  intermediate 
Pc  for  the  Model  25  apply  equally  well  here,  too.  Namely,  why 
have  such  a  complex  ISP  when  simple  ones  will  do  just  as  well? 

The programmed floating-point arithmetic times for a 4-μs
1800 and the "hardwired" (microprogrammed) System/360
Model 30 are compared in Table 2. We would expect the 2-μs
1800 to be better by a factor of 2. Note that the times are about
the same for Model 30 and the slower 1800. The
cost/performance is especially low with the 1130 (Table 1). In Chap. 33 we
discuss the 1800. It is interesting to speculate why the 1130
and 1800 cannot be implemented within the System/360
framework. Are they "loss leaders"? Are they in response to more
sophisticated, performance-oriented users?

The  PMS  structure  of  the  controls,  terminals,  secondary  memories, 
and  special  processors 

There are many common components which attach to the C's
(Figs. 11 to 17). Most of the components which attach to a Pio
are not especially interesting, but they give an idea of the
behavior and parameters. For example, the expression T('1403
Model 3; line; printer; 1100 line/min; 132 char/line; 8
bits/character; 64 ~ 240 character set) pretty well describes a
typical line printer. From the above description one can
deduce the data rate of a T(line printer). It is 132 char/line ×
1100 line/min × 1/60 min/s × 8 b/char = 19.4 kb/s.
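
The same deduction can be written as a small function, which makes it easy to apply to other terminals described in this notation. The function name is illustrative; the parameter values are those in the T('1403 Model 3) expression above.

```python
def printer_bits_per_sec(chars_per_line, lines_per_min, bits_per_char):
    """Peak data rate of a line printer, in bits per second."""
    return chars_per_line * lines_per_min / 60 * bits_per_char


rate = printer_bits_per_sec(132, 1100, 8)
print(round(rate / 1000, 1))   # 19.4 (kb/s)
```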

The channel-to-channel adapter control. The most interesting
group of components (outside the C structures) are the special
components shown in Fig. 11. The K('Channel to Channel
Adapter) allows two P's, either on the same or a different C,
to communicate with one another. This K is used in the
construction of a dual C system or the N('Attached Support
Processor/ASP) in Chap. 40, page 506. A C('40|'50) is attached to
a C('65|'75). The C('40|'50) is used as a Cio with file processing
capabilities. The K has M.buffer. Data can flow in only one
direction at a time.

Table 2 IBM 1800 (4 μs) and IBM System/360 Model 30
floating-point arithmetic timing

                     Operation times (μs)
Operation            1800 (4 μs)    System/360 Model 30
+{sf}; +{df}         460; 440       75; 115
×{sf}; ×{df}         560; 790       320; 1060
÷{sf}                766            600
(unlabeled)          4500           2965
sin {f}              3000           3876
exponential {f}      2000           4173

Fig. 11. IBM System/360 special P's and K's PMS diagrams.

The special control unit. The K('2903 Special Control Unit/SCU)
consists of two independent K's which are physically packaged
together and allow users to interface with the Pio's. Although
it has not been discussed, the actual interconnection with a
Pio, via the S(Pio; K), is via a physical I/O bus which is arranged




in  a  bus  (or  chained)  fashion.  Such  a  single  interface  to  handle 
a  wide  range  of  needs  (high  and  low  response  and  data  rates) 
via  a  single  set  of  electrical  conductors  requires  a  great  deal 
of  control  information  to  be  passed  along  the  link.  Therefore 
a  K  must  have  a  great  deal  of  knowledge  of  the  dialogue  in 
order  to  communicate.  The  hardware  to  attach  to  the  I/O  bus 
at  a  K  is  costly  and  must  be  designed  carefully.  The  K('SCU) 
provides  a  rather  simplified  interface  to  the  Pio.  All  I/O  bus 
synchronization  control,  communication  protocol  control, 
buffering,  and  electrical  isolation  are  within  K('SCU).  The 
K('SCU)  is  fairly  flexible,  in  that  devices  connected  to  it  can 
communicate  with  one  another  without  Pio  (see  Fig.  11). 

Storage-to-storage-channel processor. The P('Storage to Storage
Channel) is a special processor which performs the sole function
of transferring data blocks (a word vector) from one location
in Mp to another in Mp. It qualifies as a P, since it takes an
instruction from Mp containing the location and length, and
once the instruction is executed, another is fetched and
executed (if it exists). Thus the component has a well-defined
interpretation cycle and set of operations. This P is useful in
a multiprogrammed environment requiring programs to be
moved.

The 2938 array processor. The P.array('2938) is an extremely
interesting  special  P  (Fig.  11).  It  can  be  connected  to  Models 
44,  65,  or  75.  It  has  a  limited  instruction  repertoire,  but  the 
instructions  it  interprets  are  more  complex  than  those  in  the 
ISP  of  the  Pc.  The  instructions  are  algorithms  for  operating  on 
an  array  (a  vector  or  a  matrix).  These  instructions  include: 

1  Vector  move,  similar  to  the  P('Storage  to  Storage)  de- 
scribed above,  with  conversion  either  way  between  fixed 
and  floating  point 

2  An  element-by-element  vector  sum 

3  An  element-by-element  vector  multiplication 

4  A  row-by-column  vector  inner  product 

5  A  convolution  multiply 

6  The  solution  to  a  step  in  a  difference  equation 
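
To make the flavor of these array instructions concrete, the six operations can be sketched in a few lines of Python. This is only an illustration of the kind of algorithm each instruction stands for: the function names are hypothetical, the first-order form of the difference equation is our assumption, and nothing here describes the actual 2938 instruction formats.

```python
# Illustrative sketch of the six kinds of array operation listed above.
# All names are hypothetical; the difference-equation form is an assumption.

def vector_move(src, to_float=True):
    """Copy a vector, converting between fixed (int) and floating point."""
    return [float(x) for x in src] if to_float else [int(x) for x in src]

def vector_sum(a, b):
    """Element-by-element vector sum."""
    return [x + y for x, y in zip(a, b)]

def vector_multiply(a, b):
    """Element-by-element vector multiplication."""
    return [x * y for x, y in zip(a, b)]

def inner_product(row, col):
    """Row-by-column inner product."""
    return sum(x * y for x, y in zip(row, col))

def convolve(signal, kernel):
    """Full discrete convolution of two sequences."""
    out = [0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def difference_step(y_prev, x, a, b):
    """One step of an assumed first-order form: y[n] = a*y[n-1] + b*x[n]."""
    return a * y_prev + b * x
```

Each function touches every element of its operands once per call, which is why a single such instruction keeps the processor fetching data for a long time.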

The P.array is microprogrammed, using an M(ROS), which
makes it possible to construct complex algorithms in a flexible
manner. The hardware logic is capable of doing a combined
floating-point multiplication and addition in 200 nanoseconds.
The impressive results this P achieves in the interpretation of
the algorithms are principally because the time to access the
algorithm has gone to zero. A measure we might apply to a
P is the ratio of the time it spends fetching the algorithm's data
to the total time it spends executing the algorithm. In a
conventional computer Pc we suggest that a ratio of nearly 1/2 is
very good. Two fetches are usually required, one for data and one
for the instruction. This P has a ratio near one, as it is always
accessing data (and rarely instructions).

Fig. 12. IBM System/360 Ms(drum, disk, data cell) PMS diagrams.

Fig. 13. IBM System/360 Ms(magnetic tape) PMS diagrams.
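
The measure can be illustrated with a small sketch. As a simplifying assumption of ours, every fetch is taken to cost the same time and execution time is taken to be dominated by fetches, so the time ratio reduces to a ratio of fetch counts. The vector length of 1000 is hypothetical.

```python
# Sketch of the data-fetch ratio, assuming equal-cost fetches (our assumption).

def data_fetch_ratio(data_fetches, instruction_fetches):
    """Fraction of memory fetches that bring in data rather than instructions."""
    return data_fetches / (data_fetches + instruction_fetches)

# Conventional Pc: one instruction fetch for every data fetch.
conventional = data_fetch_ratio(1, 1)

# Array processor: one instruction drives a whole vector
# (1000 elements is a hypothetical vector length).
array_like = data_fetch_ratio(1000, 1)

print(conventional, round(array_like, 3))
```

Under these assumptions the conventional Pc sits at 1/2 and the array processor approaches 1 as the vectors grow longer.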

Secondary-memory structure. Figures 12 and 13 present the Ms
PMS structures. All the K's have an optional S, which can be
placed between the K and the S(P;K) to allow two Pio's to access
a common K (from either of two C's or two Pio's of the same
C). The K('2841 Storage Control) is interesting only in being
able to control a series of quite disparate devices, on a
one-at-a-time basis.

Figure 13 presents all the Ms(magnetic tape)'s. The
switch is interesting as it allows up to four K's to
access simultaneously any of 16 M.tapes. (The vast array of
very similar devices is undoubtedly due to marketing rather than
production or engineering reasons.) It should be noted that
there are two distinct M.tapes: conventional magnetic tape and
Hypertape. Hypertape is explicitly addressed and has built-in
error-correction coding.

Terminal structure. Figure 14 shows the T(cathode ray tube;
display) and T(audio; output). There are terminals for writing
and reading from photographic film (35 mm). The two
approaches used for audio (vocal) output are noteworthy. One
uses an M.drum to record a fixed vocabulary of words; the other
uses an encoding mechanism to allow digital information stored
in Mp to be transferred via the K('7772 Audio Response), which
transforms the coded voice back to an audio output form. The
S at the output of the T(audio) provides for audio signals to
be switched on a word-by-word basis to any of several output
telephone lines.

The  structure  of  the  vast  array  of  printing  devices  that  can 
attach  to  the  C('360)  is  shown  in  Fig.  15.  Some  of  the  devices 
are  interesting,  such  as  the  one  that  reads  pencil-marked  or 
typewritten  paper.  The  main  parameters  of  significance  to  PMS 
are  the  rate  the  device  reads  paper  together  with  the  kind  of 
paper. 

The T and K's which connect to external processes are given
in Fig. 16. The K('1827) is used to connect with analog
processes and is actually part of the IBM 1800 computer system
(Chap. 33). The other K's are important, though not especially
interesting, since they provide the K to T(Teletypes),
K(telephone lines), and T(typewriters). The K('2701) and K('2702)
are built to transform unsynchronized parallel data from the
C into the synchronized serial form required by the telephone
line. The K('2701) controls a small number of lines of high data
rates; the K('2702) controls a large number of lines at low data
rates. The K('2702) is actually an array of up to 31 K's that
are time-multiplexed, using an M.core to hold the state of
each K.

Fig. 14. IBM System/360 T(audio, display) PMS diagrams.

Peripheral  switching.  For  performance,  communications,  and 
reliability  reasons  it  is  necessary  to  provide  access  to  K's,  M's, 
or  T's  from  several  C's  or  Pio's.  A  sample  structure  of  a  pos- 
sible configuration,  using  the  above  components,  is  given  in 
Fig.  17.  The  PMS  diagram  also  shows  the  physical  structure  of 
S(from:Pc;  to:K). 

Performance  and  costs 

The System/360 series is perhaps the only group of computers
for which a valid comparison of performance and cost can be
made. The models use essentially the same technology,
implement the same ISP, and are probably constrained by a common
corporate profit goal. Even here, as we noted earlier,
comparisons are difficult to make.

Fig. 15. IBM System/360 T(printer, reader, punch) PMS diagrams.

In Table 3 we present the costs for various PMS component
primitives. From this table, costs (relative to other components)
can be obtained. These costs are expressed as dollars per
second ($/s) to rent the equipment. They have been derived
from the IBM monthly rental prices. The computer prices are
based on estimates of minimum, average, and maximum
configurations in the Adams Computer Characteristics Quarterly
[Adams Associates]. The conversion factors are

    $/s = 1/[(173.3 hour/month) x 3,600 s/hour] x $/month
        = 1.6 x 10^-6 x $/month

    $/month = 0.625 x 10^6 x $/s

The cost to buy, in dollars, is approximately

    $ = 45 x ($/month)

    $ = 45 x 0.625 x 10^6 x ($/s) = 2.82 x 10^7 x ($/s)

Fig. 16. IBM System/360 T(telephone line, analog, typewriter) PMS diagrams.
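
The conversions can be written out as a short sketch. It assumes the exponents are 10^-6 for $/month to $/s and 10^7 for $/s to purchase dollars, which is what the arithmetic with 173.3 hours per month and 45 months requires. The function names and the $10,000 monthly rental are hypothetical.

```python
# Sketch of the rental/purchase conversions; names and the example rental
# figure are hypothetical, the constants are the ones quoted in the text.

HOURS_PER_MONTH = 173.3    # rental hours per month
SECONDS_PER_HOUR = 3600
PURCHASE_MONTHS = 45       # purchase price is about 45 monthly rentals

SECONDS_PER_MONTH = HOURS_PER_MONTH * SECONDS_PER_HOUR

def rent_per_second(rent_per_month):
    """Convert a monthly rental ($/month) to a rate in $/s."""
    return rent_per_month / SECONDS_PER_MONTH

def purchase_price(rate_per_second):
    """Approximate purchase price ($) from a rental rate in $/s."""
    return PURCHASE_MONTHS * rate_per_second * SECONDS_PER_MONTH

rate = rent_per_second(10_000)   # hypothetical $10,000/month component
price = purchase_price(rate)
print(f"{rate:.4f} $/s, purchase about ${price:,.0f}")
```

With these constants one $/s of rental corresponds to about 2.8 x 10^7 dollars of purchase price, in agreement with the rounded factor above.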

Table 1 is written as a single, large PMS expression; thus, the
attributes are:

Pc(cost: ($/s|$)) := c.Pc := cost of Pc alone

Mp(cost.avg:) := c.Mp.avg := cost of average-size Mp for
a model

C(cost.min:) := c.C.min := cost of minimum-size computer
configuration

C(cost.avg:) := c.C.avg := cost of average-size computer
configuration

Primary memory

The graph of Fig. 18 gives the Mp costs, c (in $/s), versus
memory size (information/i). The line i = 1.43 x 10^7 x c is
plotted in terms of $/(by/s) and allows us to compute the
purchase cost of a bit. The purchase cost of most Mp.core is
$0.25/bit, according to the line. The 8-µs Large Capacity
Storage/LCS cost is $0.032/bit. There appear to be slight cost
savings for large Mp's and a significant saving for lower
performance in the case of LCS, a factor of 8. A reasonable formula
for Mp cost is: c = (7 x 10^-8 x i)/[t.cycle (µs)]. This formula
would account for Model 50 Mp and LCS costs, but not Model
25 and 30 Mp costs. We really need an i^(1/2) term in the formula
to make a good fit (and also a constant). The term i^(1/2) should
be present, if purchase prices are related to manufacturing
costs, because coincident-current selection cost is inherently
proportional to i^(1/2).

Fig. 17. IBM System/360 peripheral-switching PMS diagram.

Table 3  IBM System/360 component costs
(chart of cost, in $/s, on a logarithmic scale from 0.0001 to 0.1,
for the following components)

Mp (core; cost: $/(kby x s))
Mp ('Large Capacity Storage/LCS; cost: $/(kby x s))
Pc ('20|25|30|40|44|50|65|67|75|85|91)
P.array ('2938)
Pio ('2860)
Pio ('2870)
Ms ('2415; magnetic tape)
K ('2415)
Ms ('2401; magnetic tape)
K ('2803|2804)
Ms ('7340 Hypertape)
K ('2802)
Ms ('2311; removable disk)
K ('2814; #1:8)
KMs ('2314; #1:9; removable disk)
Ms ('2321 Data Cell)
K ('2814; #1:8)
Ms ('2303; drum)
K ('2814; #1:8)
Ms ('2301; drum)
K ('2820)
S ('2816; Ms.magnetic tape; K)
T ('2741; typewriter)
T ('2260; display)
K ('2848; #1:8, 16, 24)
KT ('2250; display)
T ('2761; paper tape; reader)
K ('2822)
KT ('7772|7770; audio)
T ('1403|1404; line; printer)
K ('2821; #1:3)
KT ('1443|1445; line; printer)
T ('2540; card; reader|punch)
K ('2821; #1:3)
KT ('1442|2501|2520; card; reader|punch)
K ('2701 Data Adapter)
K ('2702; typewriter; Teletype)

Fig. 18. Graph of IBM System/360 core-memory cost versus core-memory size.
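
The cost formula can be checked numerically against the two per-bit figures quoted above. The sketch takes the constant as 7 x 10^-8 with i in bytes and t.cycle in microseconds, and uses the purchase factor of 2.82 x 10^7 dollars per ($/s); both exponents are our reading of the figures, chosen because they reproduce the $0.25/bit and $0.032/bit values. The function names are hypothetical.

```python
# Sketch: Mp cost formula c = (7e-8 * i) / t.cycle, i in bytes, t.cycle in µs.
# The exponents are assumptions checked against the per-bit prices in the text.

DOLLARS_PER_RATE = 2.82e7   # purchase dollars per ($/s) of rental

def mp_rental_rate(size_bytes, t_cycle_us):
    """Rental rate c in $/s for a memory of the given size and cycle time."""
    return 7e-8 * size_bytes / t_cycle_us

def purchase_cost_per_bit(t_cycle_us):
    """Purchase cost in dollars per bit implied by the formula."""
    per_byte = mp_rental_rate(1, t_cycle_us) * DOLLARS_PER_RATE
    return per_byte / 8

core_1us = purchase_cost_per_bit(1.0)   # fast core at the plotted line
lcs_8us = purchase_cost_per_bit(8.0)    # 8-µs Large Capacity Storage
print(round(core_1us, 3), round(lcs_8us, 3))
```

The formula is linear in i, so it cannot capture the i^(1/2) term discussed above; it only reproduces the factor-of-8 saving that goes with the slower cycle time.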

An  odd  pricing  point  is  the  Model  44:  it  was  developed  after 
the  other  models  and  is  either  implemented  better  or  priced 
differently.  The  anomalies  in  Mp('65:  2"  words),  Mp('30:  2'^ 
words),  Mp('40;  2^'  bytes),  and  Mp('44)  are  undoubtedly  due 
to  pricing-strategy  differences.  In  the  case  of  the  Model  30  the 
incremental  cost  to  increase  the  Mp  size  from  2^'  to  2"'  bytes 
is the addition of only a different core array (with no change
in electronics), at a small incremental manufacturing cost of
goods.

The Mp size range within a model varies by a factor of 8
for Models 30, 40, 44, 50, 65, and 75, although by only a factor
of 4 at the ends of the line (Models 20 and 91). The Mp
implementation is usually a single common set of electronics to drive
2^14 (16,384) words in a square or coincident-current selection
system of 2^7 by 2^7. These square points are indicated on the
graph, and they should be the most economical memories.
Smaller Mp's are implemented simply by using smaller
core-memory arrays, but with the same basic electronic
configuration, e.g., the Model 30 above. Larger Mp's are obtained by
replicating the whole Mp system including the core array and
the electronics.
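
A rough sketch shows why a square array makes the selection electronics grow only as the square root of the memory size. Counting one drive line per row and one per column is our simplification, and the function names are hypothetical.

```python
# Sketch: drive lines needed by a square coincident-current core array.
# One line per row plus one per column is a deliberate simplification.
import math

def square_array_side(words):
    """Side length of a square array holding the given number of words."""
    side = math.isqrt(words)
    if side * side != words:
        raise ValueError("word count is not a perfect square")
    return side

def drive_lines(words):
    """X plus Y selection lines for a square array of the given size."""
    return 2 * square_array_side(words)

side = square_array_side(2**14)      # 16,384 words
lines_16k = drive_lines(2**14)
lines_64k = drive_lines(2**16)       # four times the words
print(side, lines_16k, lines_64k)
```

Quadrupling the number of words only doubles the number of drive lines in this model, which is the square-root behavior the text attributes to coincident-current selection.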

An  Mp  size  range  of  8  for  a  given  model  presupposes  a 
certain  structuring  of  problems.  That  is,  the  models  assume 
a  fixed  relationship  between  Pc  capacity  and  Mp  size  require- 
ments. An  ideal  system  might  let  Pc  power,  Pc  quantity,  Mp 
power,  and  Mp  size  be  completely  variable.  These  parameters 
would  all  be  selected  independently  to  match  the  work  load. 




Central  processors 

The relative Pc powers (in 360 instructions/s) and costs are
given in the graph of Fig. 19 and in Table 1. The most
significant fact from the graph is that the cost/power ratio is roughly
constant for each of the Pc's (especially if we ignore Model 44
and Model 50). Figure 19 gives the relative computing power
versus cost for various configurations. Table 1 also shows a
number of relationships. One interesting relationship (Table 1)
is the ratio of actual Pc power to maximum possible Pc power
for a model. This can be based on Mp utilization:

    Actual Pc power / Maximum Pc power
        = Mp cycles utilized by Pc / Mp cycles available

This ratio must be less than 1 unless there are many Pc's or
a single Pc has more power than Mp. In every case, the Pc is
far from fully utilizing the Mp. The technique of buffering
instructions in a local Pc memory can increase this ratio to be
>1 (although no computers ever do so). In the higher model
numbers the utilization is low because a large number of cycles
have to be available in order to avoid conflicts when a given
cycle is requested, using an Mp with a long t.cycle. In the case
of Model 25, the cycles are lost because the microprogram is
being executed from Mp. (A ratio of 0.045 indicates 21 cycles
are used for microprograms to every 1 of program.)
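
The parenthetical figure for the Model 25 follows directly from the ratio, as this small sketch shows. The function names are ours and hypothetical.

```python
# Sketch: relating the Mp utilization ratio to overhead cycles per program cycle.

def mp_utilization(program_cycles, overhead_cycles):
    """Fraction of Mp cycles that do program (rather than overhead) work."""
    return program_cycles / (program_cycles + overhead_cycles)

def overhead_per_program_cycle(ratio):
    """Overhead cycles spent for each program cycle at a given utilization."""
    return (1 - ratio) / ratio

model_25_ratio = mp_utilization(1, 21)           # 1 program cycle in 22
overhead = overhead_per_program_cycle(0.045)     # about 21 cycles
print(round(model_25_ratio, 3), round(overhead))
```

The two functions are inverses of each other, so either the ratio or the overhead count can be taken as the starting point.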

In  the  case  of  the  Model  30  the  power  is  limited  by  holding 
the  general  registers  in  Mp.  For  example,  by  using  an  additional 
fast  M  to  hold  the  general  registers  and  working  data,  the  Pc 
power  could  increase.  Unfortunately,  such  a  change  might 
cause  the  cost  of  other  parts  of  the  system  to  be  increased, 
so  that  it  would  not  be  just  a  simple  incremental  addition.  The 
C('30)  performs  well  for  the  field-scan  problem  [Solomon,  1966] 
(see  Table  1).  The  data  structure  for  the  field-scan  problem 
coincides  with  the  1-byte  Mp  organization.  C('65)  and  C('75) 
perform  the  worst  for  field  scan  because  of  the  mismatch 
between  Mp  organization  (8  bytes)  and  program  data  (1  byte). 

C('65) and C('75) have the same Mp structure and hence
have the same potential power available from Mp. In the case
of C('75) the power of the Mp is more nearly utilized.
Unfortunately for the more complex Mp structures, which have more
potential Mp cycles, the Pc is not able to utilize them. The C('65)
and C('75) have several registers concerned with obtaining the
next instruction and holding it for execution while other
instructions are obtained (look-ahead). The hardwired Model 75
Pc may account for the improvement over the Model 65
P.microprogrammed.

Fig. 19. Graph of IBM System/360 cost/processing power ratio versus cost.

The  performance  of  C('20)  is  inaccurately  high  since  it  is 
a  limited  subset  of  the  360  ISP.  (C('20)  does  not  have  float- 
ing-point or  fixed-point  multiply  and  divide  instructions,  and  it 
has  only  eight  16-bit  general  registers.)  The  hardwired  Model 
44  has  a  better  cost/power  characteristic  than  any  of  the  other 
C's,  by  any  measured  criteria  (see  Fig.  19).  In  the  case  of  the 
Model  44,  the  Pc  price  also  includes  Ms. disk.  Perhaps  the  Model 
44,  designed  initially  for  real-time  scientific  problem  solving, 
is  priced  more  competitively  with  similar  machines  (DEC  PDP-10 
and  SDS  Sigma  5,  7),  whereas  the  other  models  compete  in 
a  performance-insensitive,  competition-free  market  for  gen- 
eral-purpose business  data  processing.  Thus  its  anomalous 
position  may  be  due  to  external  market  pressures  and  not 
manufacturing  cost. 

The design of the IBM System/360 models is undoubtedly
predicated on the basis that performance or computing power
is proportional to the cost raised to some power, g, greater than
1: power = k x cost^g, where g > 1.¹ Almost all models follow
the above relationship with g > 1. When g > 1 there is an
advantage to have large configurations since the cost/computation
will decrease. If g < 1, then an alternative implementation
for the 360 C's would simply use multiple C's or Pc's to obtain
the same power. Unfortunately, such an approach does not
provide for the interconnection of the components to function
as a single unit. In many cases a single task cannot be broken
into a number of parallel and independent subtasks. If the
performance for the system varied by a factor of 100, then 100
Pc's or C's would be placed together. From Table 1 we see a
power range of about 314 corresponds to a cost range of 65
to 114 (which tells us g < 2).

The following discussion takes computing power to be
measured by instructions per second and Mp (size; t.cycle).
Costs are measured in dollars per second of rental time. The
graph (Fig. 20) shows the relationship between computing power p
and costs. The power (actually p.Pc) is taken from the
measures of instruction times for certain fixed work. Solomon
observed Grosch's law to hold for Models 30, 40, 50, 65, and 75.
This line is drawn in Fig. 20 for C(cost.average). Considering
Models 20, 25, 44, 85, and 91, a line with a less steep slope
might fit the points better. If we consider C(cost.minimum),
g < 2; considering only Pc, a g = 1 might be appropriate (see
Fig. 20) in which the power/cost is essentially constant with
cost.

¹Herb Grosch [Grosch, 1953] first noted this relationship and estimated g to be
2; thus we use g for this exponent. Adams suggested g = [Adams, 1962].
See also The Economics of Computers [Sharpe, 1969].
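
The bound on g quoted from Table 1 can be reproduced by solving power = k x cost^g for g across the whole range. This is only a two-point estimate from the range endpoints, not a fit to the individual models, and the function name is hypothetical.

```python
# Sketch: two-point estimate of the exponent g in power = k * cost**g.
import math

def grosch_exponent(power_ratio, cost_ratio):
    """Solve power_ratio = cost_ratio ** g for g."""
    return math.log(power_ratio) / math.log(cost_ratio)

# Power range of about 314 against cost ranges of 65 and 114 (from the text).
g_for_cost_65 = grosch_exponent(314, 65)
g_for_cost_114 = grosch_exponent(314, 114)
print(round(g_for_cost_65, 2), round(g_for_cost_114, 2))
```

Both estimates fall between 1 and 2, consistent with the statement that g is greater than 1 but less than Grosch's value of 2.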

Pc(cost)/Mp(cost.avg) := c.Pc/c.avg.Mp ≈ 1.1, the
ratio of processor to memory cost

C(cost.min)/C(cost.avg) := c.min.C/c.avg.C ≈ 0.47, the
ratio of the smallest computer configuration to an average
configuration

Pc(cost)/C(cost.avg) := c.Pc/c.avg.C ≈ 0.23, the ratio
of processor to computer cost

These are averages over all the series and can be rather
misleading. For example, in higher-numbered models the
C(cost.min)/C(cost.avg) := c.min.C/c.avg.C is about 0.6,
whereas in lower-numbered models the ratio is 0.3. We might
have expected this, since it indicates that a higher proportion
of system cost is in Ms and T on lower-numbered models.

An  alternative  computer  series  based  on  multiprocessing 

In this section we suggest an alternative design providing a wide
range of computing power but using multiprocessing. That is,
rather than building a higher-performance model, we would
have multiple lower-performance models. On the surface, this
appears feasible only if the cost of the processor is a relatively
small part of the computer, and if for a particular configuration
there are memory cycles available in the system (so that a more
costly memory system is not required). It is also desirable that
the proposed multiprocessor configurations have rather large
Mp's so that it can be assumed there will be several jobs in
Mp waiting to run; i.e., we should be able to multiprogram rather
than do parallel processing. These conditions are satisfied with
the System/360 models. Although we do not address the
question of development cost, it is clear that a multiprocessor
system would have a lower development cost because fewer
processor designs would be required. Within IBM we can assume that
the development cost tends to go to zero because of the large
production; unfortunately, even for IBM, the training cost for
servicemen and salesmen does not go to zero but is
proportional to the number of products. Thus, we would anticipate
savings by having a smaller line.

The multiprocessor view is presented in Table 4: namely, we
suggest dropping Models 20, 30, 40, 50, 65, 75, 85, and 91.
These would be replaced with only Models 25 and 44. Note there
are Pc's in Table 4 (other than 25 and 44) which when
multiprocessed can perform better for lower cost, e.g., 2 Model 65's
are >1 Model 75, for about the same cost. Admittedly there
are major problems in multiprocessing with 11 Pc's, but other
existence proofs [Anderson, 1961] have shown that two to four
Pc's can be effective (Chap. 36). If we ignore Models 85 and 91,
the worst case is for a maximum of four Pc's needed to obtain
the power of Model 40. Note that in the above cases the
processor cost is about one-half the cost of a single Pc. This factor
of 2 might be used to answer critics of the scheme. The reasons
against the scheme are: There have to be good switches
between Mp and Pc's; there has to be communication among the
Pc's (which is about the same as what the Pc-Pio
communication should be); and there has to be knowledge of the program
environment to split tasks apart to run in parallel.

Fig. 20. Graph of IBM System/360 relative processing power versus cost.

A  less  radical  suggestion  is  also  presented  in  Table  4: 
namely,  examining  the  number  of  processor  models  which  can 
be  used  to  provide  processing  power  for  the  next  highest  model. 


Actually, if we carry this view further and were forced to build
such a system, the view that the ideal machines are the Models
25 and 44 would undoubtedly change. Models 25 and 44 exist
and can be used for the argument. The reader should note that
there is a major flaw in our argument using a Model 25. The
microprogrammed Model 25 Pc cost should include a 16-kby
memory for the microprogram (actually one Mp should be
included for each Pc to avoid memory-request conflict). Alternatively,
if we use the Model 25 directly without a microprogram,
we would lose performance range. With our present knowledge
of multiprocessors, a responsible engineer would hardly suggest
building a multiprocessor system with 11 processors as a surefire
money-making venture. A more reasonable alternative
would be to use the multiprocessor Model 75 as an alternative
to Models 85 and 91. A reasonably safe alternative would be
three basic processors and a four-processor multiprocessor
structure. For a power range of 320:1, the basic processors could
have powers of 1, 20, and 80, giving system powers of 1, 2, 3, 4,
20, 40, 60, 80, 160, 240, and 320. This structure would leave a gap
of a factor of 5 between a 4 x 1-power multiprocessor and a
20-power processor. The largest gap in the System/360 is a factor
of 3 between Models 30 and 40.

Section 3 | The IBM System/360: a series of planned machines which span a wide performance range

Table 4  IBM System/360 Pc (power:cost) and an alternative design based on multiprocessors

          Given                  Proposed multiprocessor alternatives
Pc.model  Pc.power  Pc.cost    Quantity.Pc  Pc.model     Pc.power     Pc.cost
20        1         0.00049    1            [illegible]  [illegible]  0.0005
25        1.5       0.00050    1            25           [illegible]  0.0005
30        2         0.0013     2            25           [illegible]  0.001
                               2            20           [illegible]  0.00098
40        6         0.003      4            25           6            [illegible]
                               6            20           6            [illegible]
44        30        0.0041     1            44           30           0.0041
50        15        0.012      1            44           30           0.0041
65        63        0.022      2            44           60           [illegible]
75        92        0.037      3            44           90           0.012
                               2            65           126          0.044
85        252       0.087      8            44           240          0.033
91        314       0.091      11           44           330          0.045
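
The arithmetic behind the proposed structure can be checked with a short sketch. It assumes, as the argument above does, that ganging n identical processors yields n times the power of one; the function names are illustrative and do not come from the text.

```python
def achievable_powers(base_powers, max_processors):
    """Powers reachable by ganging 1..max_processors identical processors."""
    return sorted({power * count
                   for power in base_powers
                   for count in range(1, max_processors + 1)})


def largest_gap(powers):
    """Largest ratio between two adjacent achievable powers."""
    return max(high / low for low, high in zip(powers, powers[1:]))


powers = achievable_powers([1, 20, 80], max_processors=4)
print(powers)               # [1, 2, 3, 4, 20, 40, 60, 80, 160, 240, 320]
print(largest_gap(powers))  # 5.0
```

The factor of 5 is the step from four 1-power processors to a single 20-power processor, which is the gap the text compares with the factor of 3 between Models 30 and 40.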

Conclusions 

The IBM System/360, by achieving a production record, has
fulfilled its principal design objective. The technical goals, however,
are of interest to us here. The most interesting aspect
of the design is achieving a performance range of 314 to 1 over
a series of models, with a primary-memory size range of 2,048
to 1 for various computer configurations. Thus a user is given
a very large set of configuration alternatives. The SLT technology,
though not integrated-circuit, is certainly of the third generation.
Using SLT the fabrication of the models is superb.

There is a vast array of secondary-memory and terminal
devices to couple with almost any other system. The System/360
is the first computer to make extensive use of microprogramming.
Microprogramming is used for the definition of
the System/360 instruction-set processor, but, more important,
microprograms define previous IBM computers so that a user
can operate satisfactorily during the interim period when older
programs are being updated to use the System/360. There are
provisions for multicomputer structures. Within a single computer
structure there is adequate means of peripheral switching
so that reliable and high-performance structures can be assembled.
Early structures do not provide multiprocessing; we
have suggested multiprocessing as a technique to achieve the
same performance-range objectives. The Io processor, though
rather elaborate, provides a certain commonality.


The instruction-set processor for the System/360, based on
a general-registers structure, appears to be overly complex, yet
incomplete, because there are so many data types. The addressing
mechanism and lack of multiprogramming ability make
the System/360 a hard machine to appreciate fully. Although
we praise microprogramming as a means of accomplishing
compatibility with the past, it appears to stand in the way of
getting the most performance from the hardware. Perhaps of
most significance, the System/360 may have a greater lifetime
than any past computer.

Selected  Bibliography 

Architecture and logical structure: AmdaG64a (TeagH65)¹, BlaaG64a²,
BlaaG64b²; General implementations: AmdaG64b², CartW64, PadeA64²,
StevW64²; Microprogramming: GreeJ64, TuckS67, WebeH67; Formal
description of Pc⁵: FalkA64²; Performance and reviews: HillJ66, SoloM66;
Model 40 modifications for multiprogramming: LindA66; Model 67:
ArdeB66, FikeR68, GibsC66, LaueH67; Model 85: ContC68³, LiptJ68³,
PadeA68³; Model 91 architecture and technology: AndeD67⁴, AndeS67⁴,
BolaL67⁴, FlynM67a⁴, LangJ67⁴, LloyR67⁴, SechR67⁴, TomaR67⁴; Model
92 (proposed): ContC64 (GrimR65a), AmdaG64c (GrimR65b), ChenT64
(GrimR65c); Serviceability: CartW64; Other references: AdamC62,
CorbF62, GrosH53, SharW69, WilkM65; IBM reference manuals: IBM
System/360 Functional characteristics manuals for each model, IBM
System/360 Configurator (diagram) for each model, A22-6821-4 IBM
System/360 Principles of Operation, A22-6810-8 IBM System/360 System
Summary

¹( ) denotes the review of previous article.
²IBM Systems Journal, vol. 3, nos. 2 and 3, 1964.
³IBM Systems Journal, vol. 7, no. 1, 1968.
⁴IBM Journal of Research and Development, vol. 11, no. 1, January, 1967.
⁵Given in A Programming Language/APL [Iverson, 1962].


Chapter  43 

The structure of SYSTEM/360¹

Part  I— Outline  of  the  logical  structure 


G. A. Blaauw / F. P. Brooks, Jr.

Summary  A general introductory description of the logical structure of
SYSTEM/360 is given. In addition, the functional units, the principal
registers and formats, and the basic addressing and sequencing principles
of the system are indicated.

In the SYSTEM/360 logical structure, processing efficiency and
versatility are served by multiple accumulators, binary addressing,
bit-manipulation operations, automatic indexing, fixed and variable
field lengths, decimal and hexadecimal radices, and floating-point
as well as fixed-point arithmetic. The provisions for program
interruption, storage protection, and flexible CPU states contribute
to effective operation. Base-register addressing, the standard
interface between channels and input/output control units, and the
machine-language compatibility among models contribute to flexible
configurations and to orderly system expansion.

SYSTEM/360 is distinguished by a design orientation toward
very large memories and a hierarchy of memory speeds, a broad
spectrum of manipulative functions, and a uniform treatment of
input/output functions that facilitates communication with a
diversity of input/output devices. The overall structure lends
itself to program-compatible embodiments over a wide range of
performance levels.

The system, designed for operation with a supervisory program,
has comprehensive facilities for storage protection, program
relocation, nonstop operation, and program interruption. Privileged
instructions associated with a supervisory operating state
are included. The supervisory program schedules and governs the
execution of multiple programs, handles exceptional conditions,
and coordinates and issues input/output (I/O) instructions.
Reliability is heightened by supplementing solid-state components with
built-in checking and diagnostic aids. Interconnection facilities
permit a wide variety of possibilities for multisystem operation.

The purpose of this discussion is to introduce the functional
units of the system, as well as formats, codes, and conventions
essential to characterization of the system.

¹IBM Sys. J., vol. 3, no. 2, pp. 119-135, 1964.


Functional  structure 

The SYSTEM/360 structure schematically outlined in Fig. 1 has
seven announced embodiments. Six of these, namely, Models 30,
40, 50, 60, 62, and 70, will be treated here.¹ Where requisite I/O
devices, optional features, and storage capacity are present, these
six models are logically identical for valid programs that contain
explicit time dependencies only. Hence, even though the allowable
channels or storage capacity may vary from model to model
(as discussed in Chap. 44), the logical structure can be discussed
without reference to specific models.

Input/output

Direct communication with a large number of low-speed terminals
and other I/O devices is provided through a special multiplexor
channel unit. Communication with high-speed I/O devices is
accommodated by the selector channel units. Conceptually, the
input/output system acts as a set of subchannels that operate
concurrently with one another and the processing unit. Each
subchannel, instructed by its own control-word sequence, can
govern a data transfer operation between storage and a selected
I/O device. A multiplexor channel can function either as one or
as many subchannels; a selector channel always functions as a
single subchannel. The control unit of each I/O device attaches
to the channels via a standard mechanical-electrical-programming
interface.

Processing

The processing unit has sixteen general purpose 32-bit registers
used for addressing, indexing, and accumulating. Four 64-bit
floating-point accumulators are optionally available. The inclusion
of multiple registers permits effective use to be made of small
high-speed memories. Four distinct types of processing are provided:
logical manipulation of individual bits, character strings and
fixed words; decimal arithmetic on digit strings; fixed-point binary
arithmetic; and floating-point arithmetic. The processing unit,
together with the central control function, will be referred to as
the central processing unit (CPU). The basic registers and data
paths of the CPU are shown in Fig. 2.

The CPU's of the various models yield a substantial range in
performance. Relative to the smallest model (Model 30), the internal
performance of the largest (Model 70) is approximately 50:1
for scientific computation and 15:1 for commercial data processing.

¹A seventh embodiment, the Model 92, is not discussed in this paper. This
model does not provide decimal data handling and has a few minor differences
arising from its highly concurrent, speed-oriented organization. A
paper on Model 92 is planned for future publication in the IBM Systems
Journal.

Fig. 1. Functional schematic of System/360. [Blocks: main storage and
large-capacity storage; multiplexor channel (multiple low-speed
subchannels); selector channels (single high-speed subchannel each);
arithmetic and logic processing unit; input/output control units and
devices.]


Control

Because of the extensive instruction set, SYSTEM/360 control is
more elaborate than in conventional computers. Control functions
include internal sequencing of each operation; sequencing from
instruction to instruction (with branching and interruption);
governing of many I/O transfers; and the monitoring, signaling,
timing, and storage protection essential to total system operation. The
control equipment is combined with a programmed supervisor,
which coordinates and issues all I/O instructions, handles exceptional
conditions, loads and relocates programs and data, manages
storage, and supervises scheduling and execution of multiple programs.
To a problem programmer, the supervisory program and
the control equipment are indistinguishable.

Part 6 | Computer families

Fig. 2. Schematic of basic registers and data paths. [Blocks: storage
address; main storage; instructions; computer system control;
fixed-point operations; variable-field-length operations; floating-point
operations; 16 general registers; 4 floating-point registers.]

The functional structure of SYSTEM/360, like that of most
computers, is most concisely described by considering the data
formats, the types of manipulations performed on them, and the
instruction formats by which these manipulations are specified.

Information  formats 

The several SYSTEM/360 data formats are shown in Fig. 3. An 8-bit
unit of information is fundamental to most of the formats. A
consecutive group of n such units constitutes a field of length n.
Fixed-length fields of length one, two, four, and eight are termed
bytes, halfwords, words, and double words, respectively. In many
instructions, the operation code implies one of these four fields
as the length of the operands. On the other hand, the length is
explicit in an instruction that refers to operands of variable length.

The location of a stored field is specified by the address of the
leftmost byte of the field. Variable-length fields may start on any
byte location, but a fixed-length field of two, four, or eight bytes
must have an address that is a multiple of 2, 4, or 8, respectively.
Some of the various alignment possibilities are apparent from
Fig. 3.
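
The alignment rule can be stated as a one-line check. The sketch below is only an illustration of the rule as given above; the dictionary and function names are hypothetical.

```python
# Field lengths in bytes, as named in the text.
FIELD_LENGTHS = {"byte": 1, "halfword": 2, "word": 4, "double word": 8}


def is_valid_address(address, field):
    """True if a fixed-length field of this kind may start at `address`."""
    return address % FIELD_LENGTHS[field] == 0


print(is_valid_address(6, "halfword"))      # True
print(is_valid_address(6, "word"))          # False
print(is_valid_address(16, "double word"))  # True
```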

Storage  addresses  are  represented  by  binary  integers  in  the 
system.  Storage  capacities  are  always  expressed  as  numbers  of 
bytes. 

Processing  operations 

The SYSTEM/360 operations fall into four classes: fixed-point
arithmetic, floating-point arithmetic, logical operations, and decimal
arithmetic. These classes differ in the data formats used, the
registers involved, the operations provided, and the way the field
length is stated.

Fixed-point  arithmetic 

The  basic  arithmetic  operand  is  the  32-bit  fixed-point  binary  word. 
Halfword  operands  may  be  specified  in  most  operations  for  the 
sake  of  improved  speed  or  storage  utilization.  Some  products  and 
all  dividends  are  64  bits  long,  using  an  even-odd  register  pair. 

Because the 32-bit words accommodate the 24-bit address, the
entire fixed-point instruction set, including multiplication, division,
shifting, and several logical operations, can be used in address
computation. A two's complement notation is used for fixed-point
operands.

Fig. 3. The data formats. [Formats shown: halfword and fullword
fixed-point numbers; short floating-point number (sign, 7-bit
characteristic, 24-bit fraction); long floating-point number (sign,
7-bit characteristic, 56-bit fraction); packed decimal number (4-bit
digits); zoned decimal number (4-bit zone and 4-bit digit per byte);
fixed-length logical information; variable-length logical information
(8-bit characters).]

Additions, subtractions, multiplications, divisions, and comparisons
take one operand from a register and another from either
a register or storage. Multiple-precision arithmetic is made convenient
by the two's complement notation and by recognition of
the carry from one word to another. A pair of conversion instructions,
CONVERT TO BINARY and CONVERT TO DECIMAL,
provide transition between decimal and binary radices without
the use of tables. Multiple-register loading and storing instructions
facilitate subroutine switching.
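
The remark about multiple-precision arithmetic can be illustrated with a short sketch. It is not a model of any particular SYSTEM/360 instruction; it only shows, with hypothetical helper names, how a carry recognized from the low-order word lets two's complement values wider than one word be added a word at a time.

```python
MASK32 = 0xFFFFFFFF


def add_words(a, b, carry_in=0):
    """Add two 32-bit words; return (32-bit sum, carry out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32


def add_double(a_high, a_low, b_high, b_low):
    """Add two 64-bit two's complement values held as (high, low) word pairs."""
    low, carry = add_words(a_low, b_low)
    high, _ = add_words(a_high, b_high, carry)
    return high, low


def to_signed(high, low):
    """Interpret a (high, low) word pair as a signed 64-bit integer."""
    value = (high << 32) | low
    return value - (1 << 64) if value >> 63 else value


# -1 + 3 == 2: the carry out of the low word ripples into the high word.
print(add_double(0xFFFFFFFF, 0xFFFFFFFF, 0, 3))  # (0, 2)
```

Because the notation is two's complement, the same word-by-word addition serves for positive and negative operands without a separate sign step.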

Floating-point  arithmetic 

Floating-point numbers may occur in either of two fixed-length
formats, short or long. These formats differ only in the length of
the fractions, as indicated in Fig. 3. The fraction of a floating-point
number is expressed in 4-bit hexadecimal (base 16) digits. In the
short format, the fraction has six hexadecimal digits; in the long
format, the fraction has 14 hexadecimal digits. The short length
is equivalent to seven decimal places of precision. The long length
gives up to 17 decimal places of precision, thus eliminating most
requirements for double-precision arithmetic.

The radix point of the fraction is assumed to be immediately
to the left of the high-order fraction digit. To provide the proper
magnitude for the floating-point number, the fraction is considered
to be multiplied by a power of 16. The characteristic portion, bits
1 through 7 of both formats, is used to indicate this power. The
characteristic is treated as an excess 64 number with a range from
-64 through +63, and permits representation of decimal numbers
with magnitudes in the range of 10⁻⁷⁸ to 10⁷⁵.

Bit  position  0  in  either  format  is  the  fraction  sign,  S.  The 
fraction  of  negative  numbers  is  carried  in  true  form. 

Floating-point  operations  are  performed  with  one  operand  from 
a  register  and  another  from  either  a  register  or  storage.  The  result, 
placed  in  a  register,  is  generally  of  the  same  length  as  the  operands. 
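
The format just described can be decoded in a few lines. The following is a minimal sketch under the stated layout (sign in bit 0, excess-64 characteristic in bits 1 through 7, hexadecimal fraction with the radix point at its left); the function name and its parameters are illustrative only.

```python
def decode_hex_float(raw, fraction_digits=6):
    """Decode a hexadecimal floating-point number.

    raw: the 32-bit (short) or 64-bit (long) pattern as an integer.
    fraction_digits: 6 for the short format, 14 for the long format.
    """
    fraction_bits = 4 * fraction_digits
    sign = (raw >> (fraction_bits + 7)) & 1          # bit position 0
    characteristic = (raw >> fraction_bits) & 0x7F   # bit positions 1-7
    fraction = raw & ((1 << fraction_bits) - 1)
    # The radix point lies to the left of the high-order fraction digit,
    # and the characteristic is an excess-64 power of 16.
    magnitude = fraction / 16 ** fraction_digits * 16.0 ** (characteristic - 64)
    return -magnitude if sign else magnitude


print(decode_hex_float(0x41100000))  # 1.0
print(decode_hex_float(0xC276A000))  # -118.625
```

In the first example the characteristic 65 gives a multiplier of 16, and the fraction 0.1 (hexadecimal) is 1/16. The negative example carries its fraction in true form, with only the sign bit set.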

Logical  operations 

Operations for comparison, translation, editing, bit testing, and
bit setting are provided for processing logical fields of fixed and
variable lengths. Fixed-length logical operands, which consist of
one, four, or eight bytes, are processed from the general registers.


Fig. 4. Extended binary-coded-decimal interchange code. [Code chart.
Control-character legend: PF punch off; HT horizontal tab; LC
lowercase; DEL delete; RES restore; NL new line; BS backspace; IL
idle; BYP bypass; LF line feed; EOB end of block; PRE prefix; SM set
mode; PN punch on; RS reader stop; UC uppercase; EOT end of
transmission; SP space.]

Fig. 5. Eight-bit representation for proposed international code.
[Code chart, after the third ISO draft proposal for 6- and 7-bit coded
character sets for information processing interchange, International
Standards Organization, June 1964.]


Logical operations can also be performed on fields of up to 256
bytes, in which case the fields are processed from left to right,
one byte at a time. Moreover, two powerful scanning instructions
permit byte-by-byte translation and testing via tables. An important
special case of variable-length logical operations is the one-byte
field, whose individual bits can be tested, set, reset, and
inverted as specified by an 8-bit mask in the instruction.
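
The four one-byte mask operations amount to the familiar bitwise primitives. The sketch below uses hypothetical function names, not instruction mnemonics, and shows one way each operation might be expressed.

```python
def selected_bits(byte, mask):
    """Test: return the masked bits; zero means every selected bit is 0."""
    return byte & mask


def set_bits(byte, mask):
    """Set to 1 each bit selected by the mask."""
    return byte | mask


def reset_bits(byte, mask):
    """Reset to 0 each bit selected by the mask."""
    return byte & ~mask & 0xFF


def invert_bits(byte, mask):
    """Invert each bit selected by the mask."""
    return byte ^ mask


flags = 0b10100000
print(bin(set_bits(flags, 0b00000001)))     # 0b10100001
print(bin(reset_bits(flags, 0b10000000)))   # 0b100000
print(bin(invert_bits(flags, 0b11110000)))  # 0b1010000
```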

Character  codes 

Any 8-bit character set can be processed, although certain restrictions
are assumed in the decimal arithmetic and editing operations.
However, all character-set-sensitive I/O equipment assumes either
the Extended Binary-Coded-Decimal Interchange Code (EBCDIC)
of Fig. 4 or the code of Fig. 5, which is an eight-bit extension
of a seven-bit code proposed by the International Standards
Organization.

Decimal  arithmetic 

Decimal arithmetic can improve performance for processes requiring
few computational steps per datum between the source
input and the output. In these cases, where radix conversion from
decimal to binary and back to decimal is not justified, the use of
registers for intermediate results usually yields no advantage over
storage-to-storage processing. Hence, decimal arithmetic is provided
in SYSTEM/360 with operands as well as results located in
storage, as in the IBM 1400 series. Decimal arithmetic includes
addition, subtraction, multiplication, division, and comparison.

The decimal digits 0 through 9 are represented in the 4-bit
binary-coded-decimal form by 0000 through 1001, respectively.
The patterns 1010 through 1111 are not valid as digits and are
interpreted as sign codes: 1011 and 1101 represent a minus, the
other four a plus. The sign patterns generated in decimal arithmetic
depend upon the character set preferred. For EBCDIC, the
patterns are 1100 and 1101; for the code of Fig. 5, they are 1010
and 1011. The choice between the two codes is determined by
a mode bit.

Decimal digits, packed two to a byte, appear in fields of variable
length (from 1 to 16 bytes) and are accompanied by a sign in the
rightmost four bits of the low-order byte. Operand fields can be
located on any byte boundary, and can have lengths up to 31 digits
and sign. Operands participating in an operation have independent
lengths. Negative numbers are carried in true form. Instructions
are provided for packing and unpacking decimal numbers. Packing
of digits leads to efficient use of storage, increased arithmetic
performance, and improved rates of data transmission. For purely
decimal fields, for example, a 90,000-byte/second tape drive reads
and writes 180,000 digits/second.
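
A packed field can be built and read back as follows. This is a sketch of the storage layout only, using the EBCDIC sign patterns named above; `pack_decimal` and `unpack_decimal` are hypothetical helpers that convert to and from Python integers, and they are not meant to reproduce the machine's packing and unpacking instructions.

```python
PLUS, MINUS = 0b1100, 0b1101      # sign codes generated for EBCDIC
MINUS_CODES = {0b1011, 0b1101}    # both patterns are read as minus


def pack_decimal(value):
    """Encode an integer as packed decimal: two digits per byte, sign last."""
    nibbles = [int(ch) for ch in str(abs(value))]
    nibbles.append(MINUS if value < 0 else PLUS)
    if len(nibbles) % 2:
        nibbles.insert(0, 0)          # pad to a whole number of bytes
    if len(nibbles) > 32:
        raise ValueError("at most 31 digits and a sign fit in 16 bytes")
    return bytes(high << 4 | low
                 for high, low in zip(nibbles[::2], nibbles[1::2]))


def unpack_decimal(field):
    """Decode a packed decimal field back to an integer."""
    nibbles = [n for byte in field for n in (byte >> 4, byte & 0xF)]
    *digits, sign = nibbles
    if sign < 0b1010 or any(d > 9 for d in digits):
        raise ValueError("invalid digit or sign code")
    magnitude = int("".join(str(d) for d in digits))
    return -magnitude if sign in MINUS_CODES else magnitude


print(pack_decimal(-123).hex())                 # 123d
print(pack_decimal(1234).hex())                 # 01234c
print(unpack_decimal(bytes([0x12, 0x3D])))      # -123
```

Apart from the byte that carries the sign, every byte holds two digits, which is why a 90,000-byte/second device moves roughly 180,000 digits/second.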

Instruction  formats 

Instruction formats contain one, two, or three halfwords, depending
upon the number of storage addresses necessary for the operation.
If no storage address is required of an instruction, one halfword
suffices. A two-halfword instruction specifies one address; a
three-halfword instruction specifies two addresses. All instructions
must be aligned on halfword boundaries.

The five basic instruction formats, denoted by the format
mnemonics RR, RX, RS, SI, and SS are shown in Fig. 6. RR denotes
a register-to-register operation, RX a register and indexed-storage
operation, RS a register and storage operation, SI a storage and
immediate-operand operation, and SS a storage-to-storage operation.

In  each  format,  the  first  instruction  halfword  consists  of  two 
parts.  The  first  byte  contains  the  operation  code.  The  length  and 
format  of  an  instruction  are  indicated  by  the  first  two  bits  of  the 
operation  code. 

The  second  byte  is  used  either  as  two  4-bit  fields  or  as  a  single 
8-bit  field.  This  byte  is  specified  from  among  the  following: 

Four-bit  operand  register  designator  (R) 

Four-bit  index  register  designator  (X) 

Four-bit  mask  (M) 

Four-bit  field  length  specification  (L) 


Eight-bit field length specification

Eight-bit byte of immediate data (I)

The  second  and  third  halfwords  each  specify  a  4-bit  base 
register  designator  (B),  followed  by  a  12-bit  displacement  (D). 
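
The role of the first two operation-code bits can be sketched as a table lookup. The text says only that these bits indicate length and format; the particular assignment used below (00 for RR, 01 for RX, 10 for RS or SI, 11 for SS) is an assumption taken from the IBM System/360 Principles of Operation, and the function names are illustrative.

```python
# Assumed mapping: first two op-code bits -> (format, length in halfwords).
FORMATS = {
    0b00: ("RR", 1),
    0b01: ("RX", 2),
    0b10: ("RS or SI", 2),
    0b11: ("SS", 3),
}


def instruction_format(op_code):
    """Return (format mnemonic, halfword count) for an 8-bit operation code."""
    return FORMATS[(op_code >> 6) & 0b11]


def split_base_displacement(halfword):
    """Split a second or third halfword into its 4-bit B and 12-bit D fields."""
    return halfword >> 12, halfword & 0xFFF


print(instruction_format(0x1A))         # ('RR', 1)
print(instruction_format(0xD2))         # ('SS', 3)
print(split_base_displacement(0xC024))  # (12, 36)
```

Knowing the length from the first byte alone is what allows the instruction counter to be advanced before the rest of the instruction is interpreted.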

Addressing 

An  effective  storage  address  E  is  a  24-bit  binary  integer  given, 
in  the  typical  case,  by 

E = B + X + D

where  B  and  X  are  24-bit  integers  from  general  registers  identified 
by  fields  B  and  X,  respectively,  and  the  displacement  D  is  a  12-bit 
integer  contained  in  every  instruction  that  references  storage. 

The  base  B  can  be  used  for  static  relocation  of  programs  and 
data.  In  record  processing,  the  base  can  identify  a  record;  in  array 
calculations,  it  can  specify  the  location  of  an  array.  The  index  X 
can  provide  the  relative  address  of  an  element  within  an  array. 
Together,  B  and  X  permit  double  indexing  in  array  processing. 

The displacement provides for relative addressing of up to 4095
bytes beyond the element or base address. In array calculations,
the displacement can identify one of many items associated with
an element. Thus, multiple arrays whose indices move together
are best stored in an interleaved manner. In the processing of
records, the displacement can identify items within a record.

In  forming  an  effective  address,  the  base  and  index  are  treated 
as  unsigned  24-bit  positive  binary  integers  and  the  displacement 
as  a  12-bit  positive  binary  integer.  The  three  are  added  as  24-bit 
binary  numbers,  ignoring  overflow.  Since  every  address  is  formed 
with  the  aid  of  a  base,  programs  can  be  readily  and  generally 
relocated  by  changing  the  contents  of  base  registers. 

A zero base or index designator implies that a zero quantity
must be used in forming the address, regardless of the contents
of general register 0. A displacement of zero has no special
significance. Initialization, modification, and testing of bases and indices
can be carried out by fixed-point instructions, or by BRANCH
AND LINK, BRANCH ON COUNT, or BRANCH ON INDEX
instructions. LOAD EFFECTIVE ADDRESS provides not only a
convenient housekeeping operation, but also, when the same
register is specified for result and operand, an immediate
register-incrementing operation.
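
The address-formation rules above can be collected into one function. This is a minimal sketch: the register file is modeled as a Python list of sixteen integers, and the function name is hypothetical.

```python
ADDRESS_MASK = 0xFFFFFF  # 24-bit addresses


def effective_address(registers, base, index, displacement):
    """Form E = B + X + D; designator 0 contributes zero, not register 0."""
    if not 0 <= displacement <= 0xFFF:
        raise ValueError("displacement must fit in 12 bits")
    b = registers[base] & ADDRESS_MASK if base else 0
    x = registers[index] & ADDRESS_MASK if index else 0
    return (b + x + displacement) & ADDRESS_MASK   # overflow is ignored


registers = [0] * 16
registers[0] = 0x123456      # never used as a base or index
registers[12] = 0x008000     # base: start of a record or array
registers[3] = 0x000010      # index: element offset

print(hex(effective_address(registers, 12, 3, 4)))  # 0x8014
print(hex(effective_address(registers, 0, 0, 4)))   # 0x4
```

Relocating the program in this model means changing only `registers[12]`. The register-incrementing use of LOAD EFFECTIVE ADDRESS corresponds to `registers[3] = effective_address(registers, 3, 0, 1)`.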

Sequencing 

Normally, the CPU takes instructions in sequence. After an
instruction is fetched from a location specified by the instruction
counter, the instruction counter is increased by the number of
bytes in the instruction.

Fig. 6. Five basic instruction formats. [RR: op code, R1, R2. RX: op
code, R1, X2, B2, D2. RS: op code, R1, R3, B2, D2. SI: op code, I2,
B1, D1. SS: op code, L1, L2, B1, D1, B2, D2. The op code occupies
bits 0-7; the register, index, length, or immediate fields occupy bits
8-15; each B field is 4 bits and each D field is 12 bits.]

Conceptually, all halfwords of an instruction are fetched from
storage after the preceding operation is completed and before
execution of the current operation, even though physical storage
word size and overlap of instruction execution with storage access
may cause the actual instruction fetching to be different. Thus,
an instruction can be modified by the instruction that immediately
precedes it in the instruction stream, and cannot effectively modify
itself during execution.

Branching 

Most branching is accomplished by a single BRANCH ON CONDITION
operation that inspects a 2-bit condition register. Many
of the arithmetic, logical, and I/O operations indicate an outcome
by setting the condition register to one of its four possible states.
Subsequently a conditional branch can select one of the states
as a criterion for branching. For example, the condition code
reflects such conditions as non-zero result, first operand high,
operands equal, overflow, channel busy, zero, etc. Once set,
the condition register remains unchanged until modified by
an instruction execution that reflects a different condition
code.

The outcome of address arithmetic and counting operations
can be tested by a conditional branch to effect loop control. Two
instructions, BRANCH ON COUNT and BRANCH ON INDEX,
provide for one-instruction execution of the most common
arithmetic-test combinations.
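
The branch test can be sketched as follows. Two details are assumptions not stated in this text: that the four-bit mask (M) field carries one bit per condition-code value, the leftmost bit selecting code 0, and that BRANCH ON COUNT decrements its register and branches while the result is nonzero. The function names are illustrative.

```python
def branch_taken(mask, condition_code):
    """True if the mask bit for the current condition code (0-3) is 1."""
    return bool(mask & (0b1000 >> condition_code))


def branch_on_count(registers, r):
    """Decrement register r; report whether the loop should continue."""
    registers[r] = (registers[r] - 1) & 0xFFFFFFFF
    return registers[r] != 0


print(branch_taken(0b1000, 0))  # True:  branch only on condition code 0
print(branch_taken(0b1000, 2))  # False
print(branch_taken(0b0110, 2))  # True:  branch on code 1 or code 2

registers = [0] * 16
registers[4] = 3                # loop count
passes = 1
while branch_on_count(registers, 4):
    passes += 1
print(passes)                   # 3
```

Under this reading a mask of 1111 gives an unconditional branch and a mask of 0000 never branches.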


Fig. 7. Program status word format. [Fields: system mask (multiplexor
channel, selector channels 1-6, external); CMWP (character-set mode,
machine check, wait state, problem state); interruption code; ILC
(instruction length code); CC (condition code); program mask
(fixed-point overflow, decimal overflow, exponent underflow,
significance); instruction address.]


Program  status  word 

A program status word (PSW), a double word having the format
shown in Fig. 7, contains information required for proper execution
of a given program. A PSW includes an instruction address,
condition code, and several mask and mode fields. The active or
controlling PSW is called the current PSW. By storing the current
PSW during an interruption, the status of the interrupted program
is preserved.

Interruption 

Five  classes  of  interruption  conditions  are  distinguished:  input/ 
output,  program,  supervisor  call,  external,  and  machine  check. 

For each class, two PSW's, called old and new, are maintained
in the main-storage locations shown in Table 1. An interruption
in a given class stores the current PSW as an old PSW and then
takes the corresponding new PSW as the current PSW. If, at the
conclusion of the interruption routine, old and current PSW's are
interchanged, the system can be restored to its prior state and the
interrupted routine can be continued.
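
The old/new PSW exchange can be sketched as a toy model. The storage addresses are those of Table 1; the `Machine` class, its dictionary-based storage, and the method names are hypothetical and exist only for illustration.

```python
# Permanent storage addresses of the old and new PSW for each class (Table 1).
OLD_PSW = {"external": 24, "supervisor call": 32, "program": 40,
           "machine check": 48, "input/output": 56}
NEW_PSW = {"external": 88, "supervisor call": 96, "program": 104,
           "machine check": 112, "input/output": 120}


class Machine:
    """Toy model: storage maps an address to a whole 64-bit PSW value."""

    def __init__(self):
        self.storage = {}
        self.current_psw = 0

    def interrupt(self, cls: str) -> None:
        # Store the current PSW as the old PSW of the class,
        # then take the new PSW of the class as the current PSW.
        self.storage[OLD_PSW[cls]] = self.current_psw
        self.current_psw = self.storage[NEW_PSW[cls]]

    def resume(self, cls: str) -> None:
        # Reloading the old PSW continues the interrupted routine.
        self.current_psw = self.storage[OLD_PSW[cls]]
```

Because each class has its own pair of locations, an interruption of one class does not overwrite the saved status of another.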

The system mask, program mask, and machine-check mask bits
in the PSW may be used to control certain interruptions. When
masked off, some interruptions remain pending while others are
merely ignored. The system mask can keep I/O and external
interruptions pending, the program mask can cause four of the
15 program interruptions to be ignored, and the machine-check
mask can cause machine-check interruptions to be ignored. Other
interruptions cannot be masked off.

Appropriate CPU response to a special condition in the channels
and I/O units is facilitated by an I/O interruption. The
addresses of the channel and I/O unit involved are recorded in
the old PSW. Related information is preserved in a channel status
word that is stored as a result of the interruption.

Unusual  conditions  encountered  in  a  program  create  program 
interruptions.  Eight  of  the  fifteen  possible  conditions  involve  over- 
flows, improper  divides,  lost  significance,  and  exponent  underflow. 


Table 1    Permanent storage assignments

Address    Byte length    Purpose

      0         8         Initial program loading PSW
      8         8         Initial program loading CCW 1
     16         8         Initial program loading CCW 2
     24         8         External old PSW
     32         8         Supervisor call old PSW
     40         8         Program old PSW
     48         8         Machine check old PSW
     56         8         Input/output old PSW
     64         8         Channel status word
     72         4         Channel address word
     76         4         Unused
     80         4         Timer
     84         4         Unused
     88         8         External new PSW
     96         8         Supervisor call new PSW
    104         8         Program new PSW
    112         8         Machine check new PSW
    120         8         Input/output new PSW
    128                   Diagnostic scan-out area†

† The size of the diagnostic scan-out area is configuration dependent.

Chapter  43  [  The  structure  of  SYSTEM/ 360  597 


The  remaining  seven  deal  with  improper  addresses,  attempted 
execution  of  privileged  instructions,  and  similar  conditions. 

A supervisor-call interruption results from execution of the
instruction SUPERVISOR CALL. Eight bits from the instruction
format are placed in the interruption code of the old PSW, permitting
a message to be associated with the interruption. SUPERVISOR
CALL permits a problem program to switch CPU control
back to the supervisor.

Through an external interruption, a CPU can respond to signals
from the interruption key on the system control panel, the timer,
other CPU's, or special devices. The source of the interruption
is identified by an interruption code in bits 24 through 31 of the
PSW.

The occurrence of a machine check (if not masked off) terminates
the current instruction, initiates a diagnostic procedure, and
subsequently effects a machine-check interruption. A machine
check is occasioned only by a hardware malfunction; it cannot
be caused by invalid data or instructions.

Interrupt  priority 

Interruption requests are honored between instruction executions.
When several requests occur during execution of an instruction,
they are honored in the following order: (1) machine check, (2)
program or supervisor call, (3) external, and (4) input/output.
Because the program and supervisor-call interruptions are mutually
exclusive, they cannot occur at the same time.

If a machine-check interruption occurs, no other interruptions
can be taken until this interruption is fully processed. Otherwise,
the execution of the CPU program is delayed while PSW's are
appropriately stored and fetched for each interruption. When the
last interruption request has been honored, instruction execution
is resumed with the PSW last fetched. An interruption subroutine
is then serviced for each interruption in the order (1) input/output,
(2) external, and (3) program or supervisor call.
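
The reversal between the order in which requests are honored and the order in which their routines run follows from the PSW exchange itself, as the following sketch tries to show. It is a simplified model under stated assumptions: machine checks are left out (they block other interruptions), each routine is assumed to end by reloading its old PSW, and PSW's are represented by descriptive strings rather than real 64-bit values.

```python
# Order in which simultaneous requests are honored (machine check omitted).
HONOR_ORDER = ["program or supervisor call", "external", "input/output"]


def service_order(pending):
    """Return the order in which the interruption routines actually run."""
    old_psw = {}
    current = "interrupted program"
    # Honoring a request stores the current PSW as the old PSW of that
    # class and fetches the new PSW, which points at the class routine.
    for cls in HONOR_ORDER:
        if cls in pending:
            old_psw[cls] = current
            current = cls + " routine"
    # Execution resumes with the PSW last fetched; each routine ends by
    # reloading its old PSW, which leads to the routine honored before it.
    order = []
    while current != "interrupted program":
        cls = current[: -len(" routine")]
        order.append(cls)
        current = old_psw[cls]
    return order
```

With all three classes pending, the sketch yields input/output first, then external, then program or supervisor call, matching the order stated above.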

Program  status 

Overall CPU status is determined by four alternatives: (1) stopped
versus operating state, (2) running versus waiting state, (3) masked
versus interruptable state, and (4) supervisor versus problem state.

In the stopped state, which is entered and left by manual
procedure, instructions are not executed, interruptions are not
accepted, and the timer is not updated. In the operating state,
the CPU is capable of executing instructions and of being
interrupted.

In the running state, instruction fetching and execution proceeds
in the normal manner. The wait state is typically entered
by the program to await an interruption, for example, an I/O
interruption or operator intervention from the console. In the wait
state, no instructions are processed, the timer is updated, and I/O
and external interruptions are accepted unless masked. Running
versus waiting is determined by the setting of a bit in the current
PSW.

The CPU may be interruptable or masked for the system,
program, and machine interruptions. When the CPU is interruptable
for a class of interruptions, these interruptions are accepted.
When the CPU is masked, the system interruptions remain pending,
but the program and machine-check interruptions are ignored.
The interruptable states of the CPU are changed by altering mask
bits in the current PSW.

In the problem state, processing instructions are valid, but all
I/O instructions and a group of control instructions are invalid.
In the supervisor state, all instructions are valid. The choice of
problem or supervisor state is determined by a bit in the PSW.

Supervisory  facilities 

Timer 

A timer word in main storage location 80 is counted down at a
rate of 50 or 60 cycles per second, depending on power line
frequency. The word is treated as a signed integer according to
the rules of fixed-point arithmetic. An external interrupt occurs
when the value of the timer word goes from positive to negative.
The full cycle time of the timer is 15.5 hours.
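
The 15.5-hour figure can be checked with a short calculation. The sketch below rests on an assumption that is not stated in the text: that the count is kept in bit positions 0 through 23 of the timer word in units of 1/300 second, so that each line-frequency tick subtracts 5 units at 60 cycles per second or 6 units at 50 cycles per second. This matches the description in IBM's Principles of Operation as far as I recall it, but it should be verified before being relied on.

```python
TIMER_UNITS_PER_SECOND = 300  # assumption: one unit in bit 23 is 1/300 second
COUNT_BITS = 24               # assumption: bits 0-23 hold the count, sign included


def decrement_per_tick(line_hz: int) -> int:
    """Units subtracted on each power-line cycle."""
    return TIMER_UNITS_PER_SECOND // line_hz


def full_cycle_hours() -> float:
    """Time for the count to run through all 2**24 values."""
    return (2 ** COUNT_BITS) / TIMER_UNITS_PER_SECOND / 3600
```

Under these assumptions `full_cycle_hours()` evaluates to roughly 15.53, in agreement with the figure quoted above, and the result is the same for either line frequency.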

As an interval timer, the timer may be used to measure elapsed
time over relatively short intervals. The timer can be set by a
supervisory-mode program to any value at any time.

Direct  control 

Two instructions, READ DIRECT and WRITE DIRECT, provide
for the transfer of a single byte of information between an external
device and the main storage of the system. These instructions are
intended for use in synchronizing CPU's and special external
devices.

Storage  protection 

For protection purposes, main storage is divided into blocks of
2,048 bytes each. A four-bit storage key is associated with each
block. When a store operation is attempted by an instruction, the
protection key of the current PSW is compared with the storage
key of the affected block. When storing is specified by a channel
operation, a protection key supplied by the channel is used as the
comparand. The keys are said to match if equal or if either is zero.
A storage key is not part of addressable storage, and can be
changed only by privileged instructions. The protection key of the
CPU program is held in the current PSW. The protection key of
a channel is recorded in a status word that is associated with the
channel operation.

When  a  CPU  operation  causes  a  protection  mismatch,  its 
execution  is  suppressed  or  terminated,  and  the  program  execution 
is  altered  by  an  interruption.  The  protected  storage  location 
always  remains  unchanged.  Similarly,  protection  mismatch  due  to 
an  I/O  operation  terminates  data  transmission  in  such  a  way  that 
the  protected  storage  location  remains  unchanged. 
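
The matching rule lends itself to a very small sketch. The block size and the rule itself come from the text; the function names and the list used to hold one storage key per block are hypothetical.

```python
BLOCK_SIZE = 2048  # bytes per protected block


def block_number(address: int) -> int:
    return address // BLOCK_SIZE


def keys_match(protection_key: int, storage_key: int) -> bool:
    # Keys match if they are equal or if either one is zero.
    return (protection_key == storage_key
            or protection_key == 0
            or storage_key == 0)


def store_allowed(address: int, protection_key: int, storage_keys) -> bool:
    """storage_keys holds one four-bit key per 2,048-byte block."""
    return keys_match(protection_key, storage_keys[block_number(address)])
```

A zero key on either side therefore acts as a master key: a supervisor running with protection key zero can store anywhere, and a block with storage key zero is open to every program.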

Multisystem  operation 

Communication  between  CPU's  is  made  possible  by  shared  control 
units,  interconnected  channels,  or  shared  storage.  Multisystem 
operation  is  supported  by  provisions  for  automatic  relocation, 
indication  of  malfunctions,  and  CPU  initialization. 

Automatic relocation applies to the first 4,096 bytes of storage,
an area that contains all permanent storage assignments and
usually has special significance for supervisory programs. The
relocation is accomplished by inserting a 12-bit prefix in each
address whose high-order 12 bits are zero. Two manually set
prefixes permit the use of an alternate area when storage malfunction
occurs; the choice between prefixes is preserved in a trigger
that is set during initial program loading.
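
A minimal sketch of the prefixing rule, assuming 24-bit storage addresses (which is what a 12-bit prefix plus a 12-bit offset into the first 4,096 bytes implies). The function name is hypothetical.

```python
def relocate(address: int, prefix: int) -> int:
    """Apply the 12-bit prefix to addresses in the first 4,096 bytes."""
    if not 0 <= address < 1 << 24:
        raise ValueError("address must fit in 24 bits")
    if not 0 <= prefix < 1 << 12:
        raise ValueError("prefix must fit in 12 bits")
    if address >> 12 == 0:
        # High-order 12 bits are zero: insert the prefix there.
        return (prefix << 12) | address
    return address
```

For example, with prefix 0x010 a reference to the timer word at location 80 is redirected to location 0x010050, while an address such as 0x001000 passes through unchanged.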

To alert one CPU to the possible malfunction of another, a
machine-check signal from a given CPU can serve as an external
interruption to another CPU. By another special provision, initial
program loading of a given CPU can be initiated by a signal from
another CPU.

Input/output 

Devices  and  control  units 

Input/output  devices  include  card  equipment,  magnetic  tape 
units,  disk  storage,  drum  storage,  typewriter-keyboard  devices, 
printers,  teleprocessing  devices,  and  process  control  equipment. 
The  I/O  devices  are  regulated  by  control  units,  which  provide 
the  electrical,  logical,  and  buffering  capabilities  necessary  for  I/O 
device  operation.  From  the  programming  point  of  view,  most 
control-unit  and  I/O  device  functions  are  indistinguishable. 
Sometimes  the  control  unit  is  housed  with  an  I/O  device,  as  in 
the  case  of  the  printer. 

A control unit functions only with those I/O devices for which
it is designed, but all control units respond to a standard set of
signals from the channel. This control-unit-to-channel connection,
called the I/O interface, enables the CPU to handle all I/O
operations with only four instructions.

I/O  instructions 

Input/output instructions can be executed only while the CPU
is in the supervisor state. The four I/O instructions are START
I/O, HALT I/O, TEST CHANNEL, and TEST I/O.

START I/O initiates an I/O operation; its address field specifies
a channel and an I/O device. If the channel facilities are free,
the instruction is accepted and the CPU continues its program.
The channel independently selects the specified I/O device. HALT
I/O terminates a channel operation. TEST CHANNEL sets the
condition code in the PSW to indicate the state of the channel
addressed by the instruction. The code then indicates one of the
following conditions: channel available, interruption condition in
channel, channel working, or channel not operational. TEST I/O
sets the PSW condition code to indicate the state of the addressed
channel, subchannel, and I/O device.

Channels 

Channels provide the data path and control for I/O devices as
they communicate with main storage. In the multiplexor channel,
the single data path can be time-shared by several low-speed
devices (card readers, punches, printers, terminals, etc.) and the
channel has the functional character of many subchannels, each
of which services one I/O device at a time. On the other hand,
the selector channel, which is designed for high-speed devices, has
the functional character of a single subchannel. All subchannels
respond to the same I/O instructions. Each can fetch its own
control word sequence, govern the transfer of data and control
signals, count record lengths, and interrupt the CPU on exceptions.

Two modes of operation, burst and multiplex, are provided
for multiplexor channels. In burst mode, the channel facilities are
monopolized for the duration of data transfer to or from a particular
I/O device. The selector channel functions only in the burst
mode. In multiplex mode, the multiplexor channel sustains several
simultaneous I/O operations: bytes of data are interleaved and
then routed between selected I/O devices and desired locations
in main storage.

At  the  conclusion  of  an  operation  launched  by  START  I/O 
or  TEST  I/O,  an  I/O  interruption  occurs.  At  this  time  a  channel 
status  word  (CSW)  is  stored  in  location  64.  Figure  8  shows  the 
CSW  format.  The  CSW  provides  information  about  the  termina- 
tion of  the  I/O  operation. 
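
To make the CSW layout of Fig. 8 concrete, the following sketch splits a 64-bit CSW into its fields. The split of bits 32 through 47 into a device status byte followed by a channel status byte follows the order in which the figure legend names them; the function and key names are hypothetical.

```python
def bits(word: int, first: int, last: int, width: int = 64) -> int:
    """Extract bits first..last, numbering bit 0 as the most significant."""
    length = last - first + 1
    return (word >> (width - 1 - last)) & ((1 << length) - 1)


def unpack_csw(word: int) -> dict:
    return {
        "protection_key": bits(word, 0, 3),     # key used in the operation
        "command_address": bits(word, 8, 31),   # location of the last CCW used
        "device_status": bits(word, 32, 39),    # I/O device status byte
        "channel_status": bits(word, 40, 47),   # channel status byte
        "residual_count": bits(word, 48, 63),   # count left in the last CCW
    }
```

A supervisor would typically call something like `unpack_csw` on the double word at location 64 after an I/O interruption and inspect the two status bytes first.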

Bits 0 through 3 contain the storage protection key used in the operation.
Bits 4 through 7 contain zeros.
Bits 8 through 31 specify the location of the last CCW used.
Bits 32 through 47 contain an I/O device status byte and a channel status
byte. The status bytes provide such information as data check, chaining
check, control unit end, etc.
Bits 48 through 63 contain the residual count of the last CCW used.


Fig. 8. Channel status word format.


Successful execution of START I/O causes the channel to
fetch a channel address word from main-storage location 72. This
word specifies the storage-protection key that governs the I/O
operation, as well as the location of the first eight bytes of information
that the channel fetches from main storage. These 64 bits
comprise a channel command word (CCW). Figure 9 shows the
CCW format.
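
A CCW can be decoded along the lines of Fig. 9. In the sketch below the bit positions come from the figure; the flag names (chain data, chain command, and so on) are the conventional IBM names rather than anything stated in this chapter, and the class and function names are hypothetical.

```python
from typing import NamedTuple


class CCW(NamedTuple):
    command_code: int          # bits 0-7
    data_address: int          # bits 8-31
    chain_data: bool           # bit 32: use address portion of next CCW
    chain_command: bool        # bit 33: use command and address of next CCW
    suppress_length: bool      # bit 34: suppress incorrect-length indication
    skip: bool                 # bit 35: suppress transfer to main storage
    program_interruption: bool # bit 36: causes an interruption
    count: int                 # bits 48-63


def bit(word: int, n: int) -> bool:
    """Bit n of a 64-bit word, bit 0 being the most significant."""
    return bool((word >> (63 - n)) & 1)


def decode_ccw(word: int) -> CCW:
    if (word >> 24) & 0b111:
        raise ValueError("bits 37 through 39 must be zero")
    return CCW(
        command_code=(word >> 56) & 0xFF,
        data_address=(word >> 32) & 0xFFFFFF,
        chain_data=bit(word, 32),
        chain_command=bit(word, 33),
        suppress_length=bit(word, 34),
        skip=bit(word, 35),
        program_interruption=bit(word, 36),
        count=word & 0xFFFF,
    )
```

Bits 40 through 47 are ignored by the decoder, as the figure specifies.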

Channel  program 

One or more CCW's make up the channel program that directs
channel operations. Each CCW points to the next one to be
fetched, except for the last in the chain which so identifies itself.

Six  channel  commands  are  provided:  read,  write,  read  back- 
ward, sense,  transfer  in  channel,  and  control.  The  read  command 
defines  an  area  in  main  storage  and  causes  a  read  operation  from 
the  selected  I/O  device.  The  write  command  causes  data  to  be 
written  by  the  selected  device.  The  read-backward  command  is 
akin  to  the  read  command,  but  the  external  medium  is  moved  in 
the  opposite  direction  and  bytes  read  backward  are  placed  in 
descending  main  storage  locations. 


The control command contains information, called an order,
that is used to control the selected I/O device. Orders, peculiar
to the particular I/O device in use, can specify such functions
as rewinding a tape unit, searching for a particular track in disk
storage, or line skipping on a printer. In a functional sense, the
CPU executes I/O instructions, the channels execute commands,
and the control units and devices execute orders.

The  sense  command  specifies  a  main  storage  location  and 
transfers  one  or  more  bytes  of  status  information  from  the  selected 
control  unit.  It  provides  details  concerning  the  selected  I/O  de- 
vice, such  as  a  stacker-full  condition  of  a  card  reader  or  a  file- 
protected  condition  of  a  magnetic-tape  reel. 

A channel program normally obtains CCW's from a consecutive
string of storage locations. The string can be broken by a
transfer-in-channel command that specifies the location of the next
CCW to be used by the channel. External documents, such as
punched cards or magnetic tape, may carry CCW's that can be
used by the channel to govern the reading of the documents.
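
The way a channel steps through a channel program can be sketched as follows. This is a deliberately simplified model: each CCW is a hypothetical `(command, operand, chained)` tuple keyed by its storage address, real command encodings and flag bits are not modeled, and the `max_steps` guard is an addition of the sketch to stop a program that transfers in a loop.

```python
CCW_SIZE = 8  # each CCW occupies eight bytes of main storage


def run_channel_program(storage, start, max_steps=100):
    """Return the (command, operand) pairs executed, in order."""
    executed = []
    address = start
    for _ in range(max_steps):
        command, operand, chained = storage[address]
        if command == "transfer in channel":
            # Break the consecutive string: operand is the next CCW address.
            address = operand
            continue
        executed.append((command, operand))
        if not chained:
            # The last CCW in the chain identifies itself by not chaining.
            return executed
        address += CCW_SIZE
    raise RuntimeError("channel program did not terminate")
```

A usage example: a control CCW at location 1000 that chains to a transfer-in-channel at 1008, which redirects the channel to a read CCW at 2000, executes as control followed by read.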

The  input/output  interruptions  caused  by  termination  of  an 


[Figure 9: CCW layout, showing the command code and data address fields
and the flag positions beginning at bit 32.]

Bits 0 through 7 specify the command code.
Bits 8 through 31 specify the location of a byte in main storage.
Bits 32 through 36 are flag bits:
    Bit 32 causes the address portion of the next CCW to be used.
    Bit 33 causes the command code and data address in the next
    CCW to be used.
    Bit 34 causes a possible incorrect length indication to be suppressed.
    Bit 35 suppresses the transfer of information to main storage.
    Bit 36 causes an interruption [...]
Bits 37 through 39 must contain zeros.
Bits 40 through 47 are ignored.
Bits 48 through 63 specify the number of bytes in the operation.

Fig.  9.  Channel  command  word  format. 


Table 2    System/360 instructions


RR Format

        Branching and             Fixed-point fullword      Floating-point           Floating-point
        status switching          and logical               long                     short
xxxx    0000xxxx                  0001xxxx                  0010xxxx                 0011xxxx

0000                              LPR   LOAD POSITIVE       LPDR  LOAD POSITIVE      LPER  LOAD POSITIVE
0001                              LNR   LOAD NEGATIVE       LNDR  LOAD NEGATIVE      LNER  LOAD NEGATIVE
0010                              LTR   LOAD AND TEST       LTDR  LOAD AND TEST      LTER  LOAD AND TEST
0011                              LCR   LOAD COMPLEMENT     LCDR  LOAD COMPLEMENT    LCER  LOAD COMPLEMENT
0100    SPM   SET PROGRAM MASK    NR    AND                 HDR   HALVE              HER   HALVE
0101    BALR  BRANCH AND LINK     CLR   COMPARE LOGICAL
0110    BCTR  BRANCH ON COUNT     OR    OR
0111    BCR   BRANCH/CONDITION    XR    EXCLUSIVE OR
1000    SSK   SET KEY             LR    LOAD                LDR   LOAD               LER   LOAD
1001    ISK   INSERT KEY          CR    COMPARE             CDR   COMPARE            CER   COMPARE
1010    SVC   SUPERVISOR CALL     AR    ADD                 ADR   ADD N              AER   ADD N
1011                              SR    SUBTRACT            SDR   SUBTRACT N         SER   SUBTRACT N
1100                              MR    MULTIPLY            MDR   MULTIPLY           MER   MULTIPLY
1101                              DR    DIVIDE              DDR   DIVIDE             DER   DIVIDE
1110                              ALR   ADD LOGICAL         AWR   ADD U              AUR   ADD U
1111                              SLR   SUBTRACT LOGICAL    SWR   SUBTRACT U         SUR   SUBTRACT U


RX Format

        Fixed-point halfword      Fixed-point fullword      Floating-point           Floating-point
        and branching             and logical               long                     short
xxxx    0100xxxx                  0101xxxx                  0110xxxx                 0111xxxx

0000    STH   STORE               ST    STORE               STD   STORE              STE   STORE
0001    LA    LOAD ADDRESS
0010    STC   STORE CHARACTER
0011    IC    INSERT CHARACTER
0100    EX    EXECUTE             N     AND
0101    BAL   BRANCH AND LINK     CL    COMPARE LOGICAL
0110    BCT   BRANCH ON COUNT     O     OR
0111    BC    BRANCH/CONDITION    X     EXCLUSIVE OR
1000    LH    LOAD                L     LOAD                LD    LOAD               LE    LOAD
1001    CH    COMPARE             C     COMPARE             CD    COMPARE            CE    COMPARE
1010    AH    ADD                 A     ADD                 AD    ADD N              AE    ADD N
1011    SH    SUBTRACT            S     SUBTRACT            SD    SUBTRACT N         SE    SUBTRACT N
1100    MH    MULTIPLY            M     MULTIPLY            MD    MULTIPLY           ME    MULTIPLY
1101                              D     DIVIDE              DD    DIVIDE             DE    DIVIDE
1110    CVD   CONVERT-DECIMAL     AL    ADD LOGICAL         AW    ADD U              AU    ADD U
1111    CVB   CONVERT-BINARY      SL    SUBTRACT LOGICAL    SW    SUBTRACT U         SU    SUBTRACT U


RS, SI Format

        Branching,                Fixed-point,
        status switching,         logical, and
        and shifting              input/output
xxxx    1000xxxx                  1001xxxx                  1010xxxx                 1011xxxx

0000    SSM   SET SYSTEM MASK     STM   STORE MULTIPLE
0001                              TM    TEST UNDER MASK
0010    LPSW  LOAD PSW            MVI   MOVE
0011          DIAGNOSE            TS    TEST AND SET
0100    WRD   WRITE DIRECT        NI    AND
0101    RDD   READ DIRECT         CLI   COMPARE LOGICAL
0110    BXH   BRANCH/HIGH         OI    OR
0111    BXLE  BRANCH/LOW-EQUAL    XI    EXCLUSIVE OR
1000    SRL   SHIFT RIGHT SL      LM    LOAD MULTIPLE
1001    SLL   SHIFT LEFT SL
1010    SRA   SHIFT RIGHT S
1011    SLA   SHIFT LEFT S
1100    SRDL  SHIFT RIGHT DL      SIO   START I/O
1101    SLDL  SHIFT LEFT DL       TIO   TEST I/O
1110    SRDA  SHIFT RIGHT D       HIO   HALT I/O
1111    SLDA  SHIFT LEFT D        TCH   TEST CHANNEL


SS Format

        Logical                       Decimal
xxxx    1101xxxx                      1111xxxx

0000
0001    MVN   MOVE NUMERIC            MVO   MOVE WITH OFFSET
0010    MVC   MOVE                    PACK  PACK
0011    MVZ   MOVE ZONE               UNPK  UNPACK
0100    NC    AND
0101    CLC   COMPARE LOGICAL
0110    OC    OR
0111    XC    EXCLUSIVE OR
1000                                  ZAP   ZERO AND ADD
1001                                  CP    COMPARE
1010                                  AP    ADD
1011                                  SP    SUBTRACT
1100    TR    TRANSLATE               MP    MULTIPLY
1101    TRT   TRANSLATE AND TEST      DP    DIVIDE
1110    ED    EDIT
1111    EDMK  EDIT AND MARK


NOTE:   N = NORMALIZED      SL = SINGLE LOGICAL    DL = DOUBLE LOGICAL
        U = UNNORMALIZED    S = SINGLE             D = DOUBLE




I/O  operation,  or  by  operator  intervention  at  the  I/O  device, 
enable  the  CPU  to  provide  appropriate  programmed  response  to 
conditions  as  they  occur  in  I/O  devices  or  channels.  Conditions 
responsible  for  I/O  interruption  requests  are  preserved  in  the  I/O 
devices  or  channels  until  recognized  by  the  CPU. 

During execution of START I/O, a command can be rejected
by a busy condition, program check, etc. Rejection is indicated
in the condition code of the PSW, and additional detail on the
conditions that precluded initiation of the I/O operation is provided
in a CSW.

Manual  control 

The need for manual control is minimal because of the design of
the system and supervisory program. A control panel provides the
ability to reset the system; store and display information in main
storage, in registers, and in the PSW; and load initial program
information. After an input device is selected with the load unit
switches, depressing a load key causes a read from the selected
input device. The six words of information that are read into main
storage provide the PSW and the CCW's required for subsequent
operation.

Instruction  set 

The  system/360  instructions,  classified  by  format  and  function, 
are  displayed  in  Table  2.  Operation  codes  and  mnemonic  abbrevi- 
ations are  also  shown.  With  the  previously  described  formats  in 
mind,  much  of  the  generality  provided  by  the  system  is  apparent 
in  this  listing. 


Chapter  44 


The structure of SYSTEM/360¹

Part  II— System  Implementations 

W.  Y.  Stevens 

Summary  The performance range desired of SYSTEM/360 is obtained by
variations in the storage, processing, control, and channel functions of the
several models. The systematic variations in speed, size, and degree of
simultaneity that characterize the functional components and elements of
each model are discussed.


A  primary  goal  in  the  system/360  design  effort  was  a  wide  range 
of  processing  unit  performances  coupled  with  complete  program 
compatibility.  In  keeping  with  this  goal,  the  logical  structure  of 
the  resultant  system  lends  itself  to  a  wide  choice  of  components 
and  techniques  in  the  engineering  of  models  for  desired  perform- 
ance levels. 

This paper discusses basic choices made in implementing six
SYSTEM/360 models spanning a performance range of fifty to one.
It  should  be  emphasized  that  the  problems  of  model  implementa- 
tion were  studied  throughout  the  design  period,  and  many  of  the 
decisions  concerning  logical  structure  were  influenced  by  difficul- 
ties anticipated  or  encountered  in  implementation. 

Performance  adjustment 

The  choices  made  in  arriving  at  the  desired  performances  fall  into 
four  areas: 

Main  storage 

Central  processing  unit  (CPU)  registers  and  data  paths 
Sequence  control 
Input/output  (I/O)  channels 

Each  of  the  adjustable  parameters  of  these  areas  can  be  subordi- 
nated, for  present  purposes,  to  one  of  three  general  factors:  basic 
speed,  size,  and  degree  of  simultaneity. 

¹IBM Sys. J., vol. 3, no. 2, pp. 136-143, 1964.


Main  storage 

Storage  speed  and  size 

The interaction of the general factors is most obvious in the area
of main storage. Here the basic speeds vary over a relatively small
range: from a 2.5-μsec cycle for the Model 40 to a 1.0-μsec cycle
for Models 62 and 70. However, in combination with the other
two factors, a 32:1 range in overall storage data rate is obtained,
as shown in Table 1.

Most  important  of  the  three  factors  is  size.  The  width  of  main 
storage,  i.e.,  the  amount  of  data  obtained  with  one  storage  access, 
ranges  from  one  byte  for  the  Model  30,  two  bytes  for  the  Model 
40,  and  four  bytes  for  the  Model  50,  to  8  bytes  for  Models  60, 
62,  and  70. 

Another size factor, less direct in its effect, is the total number
of bytes in main storage, which can make a large difference in
system throughput by reducing the number of references to external
storage media. This number ranges from a minimum of 8192
bytes on Model 30 to a maximum of 524,288 bytes on Models 60,
62, and 70. An option of up to eight million more bytes of slower-speed,
large-capacity core storage can further increase the
throughput in some applications.

Interleaved  storage 

Simultaneity in the core storage of Models 60 and 70 is obtained
by overlapping the cycles of two storage units. Addresses are
staggered in the two units, and a series of requests for successive
words activates the two units alternately, thus doubling the
maximum rate. For increased system performance, this technique
is less effective than doubling the basic speed of a single unit, since
the access time to a single word is not improved, and successive
references frequently occur to the same unit. This is illustrated
by comparing the performances of Models 60 and 62, whose only
difference is the choice between two overlapped 2.0-μsec storage
units and one single 1.0-μsec storage unit, respectively. The performance
of Model 62 is approximately 1.5 times that of Model 60.
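
The maximum data rates listed in Table 1 follow directly from the three factors, as this short sketch illustrates. The cycle times, widths, and interleaving entries are taken from Table 1; the dictionary layout and function name are hypothetical, and the factor of two for interleaving reflects the two overlapped storage units described above.

```python
# model: (cycle time in μsec, width in bytes, interleaved access)
MODELS = {
    "30": (2.0, 1, False),
    "40": (2.5, 2, False),
    "50": (2.0, 4, False),
    "60": (2.0, 8, True),
    "62": (1.0, 8, False),
    "70": (1.0, 8, True),
}


def max_data_rate(cycle_usec: float, width_bytes: int, interleaved: bool) -> float:
    """Peak storage data rate in bytes per microsecond."""
    units = 2 if interleaved else 1  # two overlapped units double the peak rate
    return units * width_bytes / cycle_usec


rates = {model: max_data_rate(*spec) for model, spec in MODELS.items()}
```

Models 60 and 62 come out with the same peak rate of 8 bytes per microsecond, which is why their measured performance difference is attributed to access time and to successive references falling in the same unit rather than to peak bandwidth.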


Chapter  44  j  The  structure  of  system/360  603 


Table 1    System/360 main storage characteristics

                                     Model    Model    Model    Model    Model    Model
                                     30       40       50       60       62       70

Cycle time (μsec)                    2.0      2.5      2.0      2.0      1.0      1.0
Width (bytes)                        1        2        4        8        8        8
Interleaved access                   no       no       no       yes      no       yes
Maximum data rate (bytes/μsec)       0.5      0.8      2.0      8.0      8.0      16.0
Minimum storage size (bytes)         8,192    16,384   65,536   131,072  262,144  262,144
Maximum storage size (bytes)         65,536   262,144  262,144  524,288  524,288  524,288
Large capacity storage attachable    no       no       yes      yes      yes      yes

CPU  registers  and  data  paths 

Circuit  speed 

SYSTEM/360 has three families of logic circuits, as shown in Table
2, each using the same solid-logic technology. One family, having
a nominal delay of 30 nsec per logical stage or level, is used in
the data paths of Models 30, 40, and 50. A second and faster family
with a nominal delay of 10 nsec per level is used in Models 60
and 62. The fastest family, with a delay of 6 nsec, is used in Model
70.

The fundamental determinant of CPU speed is the time required
to take data from the internal registers, process the data
through the adder or other logical unit, and return the result to
a register. This cycle time is determined by the delay per logical
circuit level and the number of levels in the register-to-adder path,
the adder, and the adder-to-register return path. The number of
levels varies because of the trade-off that can usually be made
between the number of circuit modules and the number of logical
levels. Thus, the cycle time of the system varies from 1.0 μsec for
Model 30 (with 30-nsec circuits, a relatively small number of
modules, and more logic levels) and 0.5 μsec for Model 50 (also
with 30-nsec circuits, but with more modules and fewer levels)
to 0.2 μsec for Model 70 (with 6-nsec circuits).
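
The relation between cycle time, circuit delay, and number of logic levels can be illustrated with a rough calculation. The sketch simply divides the quoted cycle time by the nominal delay per level; it ignores clocking margins, wiring delay, and register set-up time, so the results are at best an upper bound on the levels traversed per cycle and not figures given by the author.

```python
def implied_levels(cycle_usec: float, delay_nsec: float) -> float:
    """Logic levels that fit in one CPU cycle at the nominal delay per level."""
    return cycle_usec * 1000 / delay_nsec


model_30 = implied_levels(1.0, 30)  # 30-nsec circuits, 1.0-μsec cycle
model_50 = implied_levels(0.5, 30)  # 30-nsec circuits, 0.5-μsec cycle
model_70 = implied_levels(0.2, 6)   # 6-nsec circuits, 0.2-μsec cycle
```

On this crude estimate Model 50 fits about half as many levels into its cycle as Model 30, consistent with the stated trade of more modules for fewer levels, while Model 70 gains its speed mainly from faster circuits.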

Local  storage 

The speed of the CPU depends also on the speed of the general
and floating-point registers. In Model 30, these registers are located
in an extension to the main core storage and have a read-write


Table 2    System/360 CPU characteristics

                                          Model        Model        Model        Model         Model
                                          30           40           50           60/62         70

Circuit family: nominal delay per
  logic level (nsec)                      30           30           30           10            6
Cycle time (μsec)                         1.0          0.625        0.5          0.25          0.2
Location of general and floating          main         local        local        local         transistor
  registers                               core         core         core         transistor    registers
                                          storage      storage      storage      storage
Width of general and floating
  register storage (bytes)                1            2            4            4             4 or 8
Speed of general and floating
  register storage (μsec)                 2.0          1.25         0.5          0.25
Width of main adder path (bits)           8            8            32           56            64
Width of auxiliary transfer path (bits)                16           8
Widths of auxiliary adder paths (bits)                                           8             8, 8, and 24
Approximate number of bytes of
  register storage                        12           15           30           50            100
Approximate number of bytes of
  working locations in local storage      45 (main     48           60           4
                                          storage)
Relative computing speed                  1            3.5          10           21/30         50



time of 2.0 μsec. In Model 40, the registers are located in a small
core-storage unit, called local storage, with a read-write time of
1.25 μsec. Here, the operation of the local storage may be overlapped
with main storage. In Model 50, the registers are in a local
storage with a read-write time of only 0.5 μsec. In Model 60/62,
the local storage has the logical characteristics of a core storage
with nondestructive read-out; however, it is actually constructed
as an array of registers using the 30-nsec family of logic circuits,
and has a read-write time of 0.25 μsec. In Model 70, the general
and floating-point registers are implemented with 6-nsec logic
circuits and communicate directly with the adder and other data
paths.

The  two  principal  measures  of  size  in  the  CPU  are  the  width 
of  the  data  paths  and  the  number  of  bytes  of  high-speed  working 
registers. 

Data  path  organization 

Model  30  has  an  8-bit  wide  (plus  parity)  adder  path,  through  which 
all  data  transfers  are  made,  and  approximately  12  bytes  of  working 
registers. 

Model 40 also has an 8-bit wide adder path, but has an additional
16-bit wide data transfer path. Approximately 15 bytes of
working registers are used, plus about 48 bytes of working locations
in the local storage, exclusive of the general and floating-point
registers.

Model  50  has  a  32-bit  wide  adder  path,  an  8-bit  wide  data  path 
used  for  handling  individual  bytes,  approximately  30  bytes  of 
working  registers,  plus  about  60  bytes  of  working  locations  in  the 
local  storage. 

Model  60/62  has  a  56-bit  wide  main  adder  path,  an  8-bit  wide 
serial  adder  path,  and  approximately  50  bytes  of  working  registers. 

Model 70 has a 64-bit wide main adder, an 8-bit wide exponent
adder, an 8-bit wide decimal adder, a 24-bit wide addressing adder,
and several other data transfer paths, some of which have
incrementing ability. The model has about 100 bytes of working registers
plus the 96 bytes of floating-point and general registers which, in
Model 70, are directly associated with the data paths.
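
As a rough illustration of what these widths imply, the following Python sketch counts how many passes through the main adder each model would need for an operand of a given length. The adder widths are those quoted above; the function name and the assumption that an operand is handled one adder-width slice at a time, with no overlap and no use of auxiliary paths, are simplifications made only for this example.

```python
import math

# Main adder path widths in bits, as quoted in the text above.
MAIN_ADDER_BITS = {"30": 8, "40": 8, "50": 32, "60/62": 56, "70": 64}

def adder_passes(operand_bits, model):
    """Passes needed if the operand moves one adder-width slice at a time.

    This ignores auxiliary paths and overlap, so it is only a size comparison.
    """
    return math.ceil(operand_bits / MAIN_ADDER_BITS[model])

if __name__ == "__main__":
    for model in MAIN_ADDER_BITS:
        print("Model", model, "passes for a 32-bit word:", adder_passes(32, model))
```

Under these assumptions a 32-bit word takes four passes on Models 30 and 40 and a single pass on Models 50 and above.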

The models of System/360 differ considerably in the number
of relatively independent operations that can occur simultaneously
in the CPU. Model 30, for example, operates serially: virtually all
data transfers must pass through the adder, one byte at a time.
Model 70, however, can have many operations taking place at the
same time. The CPU of this model is divided into three units that
operate somewhat independently. The instruction preparation unit
fetches instructions from storage, prepares them by computing
their effective addresses, and initiates the fetching of the required
data. The execution unit performs the execution of the instruction
prepared by the instruction unit. The third unit is a storage bus
control which coordinates the various requests by the other units
and by the channels for core-storage cycles. All three units
normally operate simultaneously, and together provide a large degree
of instruction overlap. Since each of the units contains a number
of different data paths, several data transfers may be occurring
on the same cycle in a single unit.

The operations of other System/360 models fall between those
mentioned. Model 50, for example, can have simultaneous data
transfers through the main adder, through an auxiliary byte
transfer path, and to or from local storage.

Sequence  control 

Complex  instruction  sequences 

Since System/360 has an extensive instruction set, the CPU's
must be capable of executing a large number of different sequences
of basic operations. Furthermore, many instructions require
sequences that are dependent on the data or addresses used. As
shown in Table 3, these sequences of operations can be controlled
by two methods: either by a conventional sequential logic circuit
that uses the same types of circuit modules as used in the data
paths, or by a read-only storage device that contains a
microprogram specifying the sequences to be performed for the different
instructions.

Model 70 makes use of conventional sequential logic control
mainly because of the high degree of simultaneity required. Also,
a sufficiently fast read-only storage unit was not available at the
time of development. The sequences to be performed in each of
the Model 70 data paths have a considerable degree of
independence. The read-only storage method of control does not easily lend
itself to controlling these independent sequences, but is well
adapted where the actions in each of the data paths are highly
coordinated.

Read-only  storage  control 

The read-only storage method of control is described elsewhere
[Peacock, 19??]. This microprogram control, used in all but the
fastest model of System/360, is the only method known by which
an extensive instruction set may be economically realized in a
small system. This was demonstrated during the design of Model
60/62. Conventional logic control was originally planned for this
model, but it became evident during the design period that too
many circuit modules were required to implement the instruction
set, even for this rather large system. Because a sufficiently fast
read-only storage became available, it was adopted for sequence
control at a substantial cost reduction.




Table 3  System/360 sequence control characteristics

                                                  Model      Model      Model      Model      Model
                                                  30         40         50         60/62      70

Type                                              read-only  read-only  read-only  read-only  sequential
                                                  storage    storage    storage    storage    logic
Cycle time (µsec)                                 1.0        0.625      0.5        0.25       0.2
Width of read-only storage word (available bits)  60         60         90         100        -
Number of read-only storage words available       4096       4096       2816       2816       -
Number of gate-control fields in read-only        9          10         15         16         -
  storage word

The three factors of speed, size, and simultaneity are applicable
to the read-only storage controls of the various System/360 models.
The speed of the read-only storage units corresponds to the cycle
time of the CPU, and hence varies from 1.0 µsec per access for
Model 30 down to 0.25 µsec for Models 60 and 62.

The size of read-only storage can vary in two ways: in width
(number of bits per word) and in number of words. Since the bits
of a word are used to control gates in the data paths, the width
of storage is indirectly related to the complexity of the data paths.
The widths of the read-only storages in System/360 range from
60 bits for Models 30 and 40 to 100 bits for Models 60 and 62.
The number of words is affected by several factors. First, of course,
is the number and complexity of the control sequences to be
executed. This is the same for all models except that Model 60/62
read-only storage contains no sequences for channel functions. The
number of words tends to be greater for the smaller models, since
these models require more cycles to accomplish the same function.
Partially offsetting this is the fact that the greater degree of
simultaneity in the larger systems often prevents the sharing of
microprogram sequences between similar functions.

System/360 employs no read-only storage simultaneity in the
sense that more than one access is in progress at a given time.
However, a single read-only storage word simultaneously controls
several independent actions. The number of different gate control
fields in a word provides some measure of this simultaneity. Model
30 has 9 such fields; Model 60/62 has 16.
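
To make the two size measures concrete, the sketch below multiplies word width by word count for each microprogrammed model, using the figures in Table 3. The dictionary layout and function name are our own; the result is simply the number of available read-only storage bits and says nothing about how many of them a given microprogram actually uses.

```python
# (word width in available bits, number of words available), from Table 3.
# Model 70 is omitted because it uses sequential logic, not read-only storage.
READ_ONLY_STORAGE = {
    "30": (60, 4096),
    "40": (60, 4096),
    "50": (90, 2816),
    "60/62": (100, 2816),
}

def ros_capacity_bits(model):
    """Total available read-only storage bits: width times number of words."""
    width_bits, words = READ_ONLY_STORAGE[model]
    return width_bits * words

if __name__ == "__main__":
    for model in READ_ONLY_STORAGE:
        print("Model", model, ros_capacity_bits(model), "bits")
```

On these figures the total capacities lie within about 15 percent of one another, even though the width and word count vary considerably from model to model.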

Input/output  channels 

Channel  design 

The System/360 input/output channels may be considered from
two viewpoints: the design of a channel itself, or the relationship
of a channel to the whole system.

From the viewpoint of channel design, the raw speed of the
components does not vary, since all channels use the 30-nsec family
of circuits. However, the different channels do have access to
different speeds of main storage and, in the three smaller models,
different speeds of local storage.

The channels differ markedly in the amount of hardware
devoted exclusively to channel use, as shown in Table 4. In the Model
30 multiplexor channel, this hardware amounts only to three
1-byte wide data paths, 11 latch bits for control, and a simple
interface polling circuit. The channel used in Models 60, 62,
and 70 contains about 300 bits of register storage, a 24-bit wide
adder, and a complete set of sequential control circuits. The
amount of hardware provided for other channels is somewhere in
between these extremes.

The disparity in the amount of channel hardware reflects the
extent to which the channels share CPU hardware in
accomplishing their functions. Such sharing is done at the expense of increased
interference with the CPU, of course. This interference ranges
from complete lock-out of CPU operations at high data rates on
some of the smaller models, to interference only in essential
references to main storage by the channel in the large models.

Channel system relationship

When the channels are viewed in their relationship to the whole
system, the three factors of speed, size, and simultaneity take on
a different aspect. The channel is viewed as a system component,
and its effect on system throughput and other system capabilities
is of concern. The speeds of the channels vary from a maximum
rate of about 16 thousand bytes per second (byte interleaved mode)
on the multiplexor channel of Model 30 to a maximum rate of
about 1250 thousand bytes per second on the channels of Models
60, 62, and 70. The size of each of the channels is the same, in
the sense that each handles an 8-bit byte at a time and each can
connect to eight different control units. A slight size difference
exists among multiplexor channels in terms of the maximum
number of subchannels.
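
The spread between the slowest and fastest channel rates is easier to appreciate as time per byte. The sketch below converts the two maximum rates (16 and 1250 thousand bytes per second, as listed in Table 4) into microseconds per byte; the function name is our own and the figures are maxima, not sustained rates for any particular device.

```python
def microseconds_per_byte(kilobytes_per_second):
    """Time available per byte, in microseconds, at a rate given in thousand bytes/sec."""
    return 1000.0 / kilobytes_per_second

# Maximum rates taken from Table 4 (thousand bytes per second).
MODEL_30_MULTIPLEXOR_BYTE_MODE = 16
LARGE_MODEL_SELECTOR = 1250

if __name__ == "__main__":
    slow = microseconds_per_byte(MODEL_30_MULTIPLEXOR_BYTE_MODE)
    fast = microseconds_per_byte(LARGE_MODEL_SELECTOR)
    print(slow, "usec per byte vs", fast, "usec per byte; ratio", slow / fast)
```

At these rates a byte arrives every 62.5 µsec on the Model 30 multiplexor channel and every 0.8 µsec on the fastest selector channels, a ratio of roughly 78 to 1.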

The degree of channel simultaneity differs considerably among
the various models of System/360. For example, operation of the
Model 30 or 40 multiplexor channels in burst mode inhibits all



Table 4  System/360 channel characteristics

                                                 Model  Model      Model            Model  Model
                                                 30     40         50               60/62  70

Selector channels
  Maximum number attachable                      2      2          3                6      6
  Approximate maximum data rate on one           250    400        800 (1250 on     1250   1250
    channel in Kbyps†                                              high speed)
  Uses CPU data paths for:
    initiation and termination                   yes    yes        yes              yes    yes
    byte transfers                               no     no         no               no     no
    storage word transfers                       no     low speed  yes              no     no
                                                        only
    chaining                                     yes    yes        yes              no     no
  CPU and I/O overlap possible                   yes    yes        regular: yes     yes    yes
                                                                   high speed: no

Multiplexor channels
  Maximum number attachable                      1      1          1                0      0
  Minimum number of subchannels                  32     16         64
  Maximum number of subchannels                  96     128        256
  Maximum data rate in byte interleaved mode     16     30         40
    (Kbyps)
  Maximum data rate in burst mode (Kbyps)        200    200        200
  Uses CPU data paths for all functions          yes    yes        yes
  CPU and I/O overlap possible in byte mode      yes    yes        yes
  CPU and I/O overlap possible in burst mode     no     no         yes

† Thousand bytes per second.


other activity on the system, as does operation of the special
high-speed channel on Model 50. At the other extreme, as many
as six selector channels can be operating concurrently with the
CPU on Models 60, 62, or 70. A second type of simultaneity is
present in the multiplexor channels available on Models 30, 40,
and 50. When operating in byte interleaved mode, one of these
channels can control a number of concurrently operating
input/output devices, and the CPU can also continue operation.

Differences  in  application  emphasis 

The models of System/360 differ not only in throughput but also
in the relative speeds of the various operations. Some of these
relative differences are simply a result of the design choices
described in this paper, made to achieve the desired overall
performance. The more basic differences in relative performance of the
various operations, however, were intentional. These differences
in emphasis suit each model to those applications expected to
comprise its largest usage.

Thus the smallest system is particularly aimed at traditional
commercial data processing applications. These are characterized
by extensive input/output operations in relation to the internal
processing, and by more character handling than arithmetic. The
fast selector channels and character-oriented data paths of Model
30 result from this emphasis. But despite this emphasis, the
general-purpose instruction set of System/360 results in much better
scientific application performance for Model 30 than for its
comparable predecessors.

On the other hand, the large systems are expected to find
particularly heavy use in scientific computation, where the
emphasis is on rapid floating-point arithmetic. Thus Models 60, 62,
and 70 contain registers and adders that can handle the full length
of a long format floating-point operand, yet do character
operations one byte at a time.

No particular emphasis on either commercial or scientific
applications characterizes the intermediate models. However,
Models 40 and 50 are intended to be particularly suitable for
communication-oriented and real-time applications. For example,
Model 50 includes a multiplexor channel, storage protection, and
a timer as standard features, and also provides the ability to share
main storages between two CPU's in a multiprocessing
arrangement.

References 

PeacA?? 


Appendix 

PMS  and  ISP  notations 


This appendix provides complete definitions of the notations used for the
PMS and ISP descriptions. It is intended to supplement Chap. 2, which
provides an informal description of the notations along with some comments
on motivation and underlying rationale.

The two descriptive systems are consistent with each other in two
senses. First, certain general conventions that have to do with forming
expressions and abbreviating apply to both systems. Second, the values of
certain PMS attributes are describable in ISP but not in PMS. A complete
"top down" development would thus embed ISP within PMS.
Nevertheless, it appears appropriate to present them as two distinct notations: it
makes reference easier and permits each to be organized around its own
most important notions.

The style of presentation is moderately formal. Within a section, the
syntax is presented, followed by remarks on the interpretation to be given
to these syntactic forms (the semantics). Examples that help to pin down
the notations are furnished throughout. Although the notation is not a
computer language, we present it as if it were; thus, a number of elementary things
are provided for in the definitions. (Part of the motivation for this is to
introduce abbreviations.)

A language can be realized in many media. In this book we have taken
some advantage of printing orthography insofar as it enhances
communication. However, it may also be necessary to map the notations into
various restrictive character sets, e.g., those of the typewriter and the
computer. For the sake of brevity, we do not discuss this coding problem here.

The appendix is in three parts. The first part gives the general
conventions common to both PMS and ISP. The second and third parts give
PMS (page 615) and ISP (page 628), as discussed in Chap. 2.

General  conventions 

The  conventions  given  in  this  section  define  the  general  nature  of  the 
syntax  and  semantics  of  both  PMS  and  ISP. 

These general conventions parallel closely natural usage by technically
trained people familiar with programming languages, such as ALGOL.
There is no need to consult these sections if the brief statements and
illustrations following each subsection title are clearly understood.

1  Basic  semantics 

The  language  can  refer  to  any  entities  that  are  given  by  attributes 
and  values. 

2  Metanotation 

(There  is  no  need  for  metanotation  unless  general  conventions  are 
to  be  read  in  detail.) 


3  Basic syntax

Expressions  are  built  up  from  subexpressions  and  ultimately  from 
names.  Parentheses  are  used  to  avoid  ambiguity. 

4  Commands:  assignments,  abbreviations,  variables,  forms 

x := y assigns the name x to mean the same as the expression y.
x / y establishes the name y as an abbreviation or alternative
name (alias) for x.

x ∸ y := min(x - y, 0) defines a new binary operation (∸) by
means of a form in the variables x and y.

5  Indefinite expressions

a | b | c means one of a or b or c.

x ~ y means the interval from x up to and including y.
~x means an interval around x of undetermined scope.

6  Lists  and  sets 

(3, 5, 1, 5) is a list of digits, which also could have been written
(3; 5; 1; 5). Digit-list refers to all possible lists of digits. Digit-set
refers to all possible sets of digits, unordered and without repetition.

7  Definite  expressions 

X := (size: integer; function: (primary | secondary); control: (yes |
no)) defines X to be an entity with an attribute, size, taking any
integer as value; with an attribute, function, taking primary or
secondary as value; and with an attribute, control, taking yes or
no as value.

Y := X(size: 12 ~ 20; primary; ¬control) defines Y as an entity of
type X which is further specified by having size between 12 and
20, having the value of function be primary and the value of
control be no.

8  Attributes 

3:Z is the third item on the list Z; -1:Z is the last item.
(add-time, store-time) can be an attribute and then has values such as
(10 µs, 6 µs).

9  Null  symbol  and  optional  expressions 

∅ is the null symbol, so that (x, ∅, y) is the same as (x, y). *x means
that x is optional; it is defined as (x | ∅).

10  Names 

Simple-names are strings of letters and digits, permitting
concatenation with the space (␣) and the hyphen (-).
'The␣big␣instruction-set' is a simple-name.

Memory.primary is a compound name, which is an abbreviation
for Memory(primary).

Classes  of  names  can  be  constructed  and  assigned  to  be  used  for 
various  entities — if  for  an  entity,  X,  then  called  X-names. 




11  Numbers 

Numbers  and  arithmetic  expressions  are  defined  in  the  standard 
fashion. 

12  Quantities,  dimensions,  and  units 

A  quantity  is  just  a  dimensionalized  number — a  number  of  units 
along  a  given  dimension. 

13  Booleans  and  relations 

Logical expressions involving and (∧), or (∨), not (¬), implies
(⊃), equivalence (≡), and exclusive-or (⊕) are defined in
standard fashion, as are expressions involving the six basic relations
(=, ≠, <, >, ≤, ≥).

1.    Basic  semantics 

1.1  We  will  use  the  term  "entity"  to  refer  to  all  things  designatable  by 
expressions  in  the  language. 

1.2 An entity is assumed to be fully characterizable by a set of attributes
and associated values, which are themselves entities.

COMMENT  There  will  necessarily  be  entities  with  no  further  specification 
within  the  system — that,  in  effect,  have  only  a  name. 

The  semantics  of  the  language  consists  in  showing  how  expressions  in  the 
language  determine  the  various  attributes  and  values. 

1.3  There  are  three  types  of  expressions. 

1  A  definite  expression  designates  an  entity. 

2  An  indefinite  expression  defines  a  class  of  definite  expressions;  it 
designates  one  of  the  entities  designated  by  members  of  this  class. 

3  A  command  designates  the  establishment  of  some  purely  linguistic 
convention. 

EXAMPLES  'IBM 7090' is a definite expression.

          Mp is an indefinite expression (any primary memory).

          SAM := Mp is a command to give the name SAM to an Mp.

1.4 There are also English language comments, which are connected with
the language only in being associated with particular occurrences of
expressions (on which they comment) and in having a punctuation convention
that allows them to be unambiguously distinguished from expressions in
the language.

1  In the book we use italics.

EXAMPLE  This is an example of a comment; it may appear anywhere.


2.  Metanotation 

2.1 The language itself is described by giving various classes of
expressions and assigning meanings to the members of these classes (i.e., telling
what they designate). We will generally do this in English but with a few
special notations.

2.2  Expression-variables 

1  Let a, b, ..., A, B, ... be variables whose domain is a set of
expressions.

2  Let class(a) be the set of definite expressions defined by the
indefinite expression a. This is extended to definite expressions, x, by
defining class(x) = x.

COMMENT  Normally lowercase variables (e.g., a) stand for any
legal expression, whereas uppercase variables (e.g., A) stand for
any indefinite expression.

2.3  We  will  define  the  language  by  giving  forms  of  expressions,  that  is, 
by  writing  down  sequences  of  expressions  and  expression-variables.  These 
forms  are  to  be  interpreted  as  permitting  any  expression  that  results  from 
replacing  the  expression-variables  with  expressions  from  their  respective 
domains. 

EXAMPLE  If  the  form  x|y  is  legal,  where  x  and  y  range  over  components, 
then  the  expression  M  |  P  is  legal. 

2.4 The one special notation is the expression form

x ∘ x ...

which is to be taken as permitting an indefinite sequence of x's separated
by ∘'s, terminating with an x, where each occurrence is to be viewed as
an independent variable. That is, x ∘ x ... is equivalent to

x
or
x ∘ x
or
x ∘ x ∘ x
or
x ∘ x ∘ x ∘ x
etc.

EXAMPLE  d ∘ d ..., where d ranges over digits and ∘ over arithmetic
operations, could have as instances: 5, 6 + 6, 7 - 2 ÷ 3, etc.




COMMENT  Note  that  we  have  used  the  same  variable  several  times,  even 
though  independently  selected  values  are  meant  at  each  occurrence.  It  will 
always  be  clear  from  the  context  when  this  is  being  done. 


3.  Basic  syntax 

3.1 An expression is either a name or a sequence of expressions.

3.2 A name is a sequence of characters written without spaces.

3.3 A character is a member of one of the following alphabets:

1  Capital letters    A B ... Z

2  Small letters      a b ... z

3  Digits             0 1 ... 9

4  Marks  |  ; , :  «- ^  =  ^  ®  D  V  A -,  =  ^  <  > 

<>?+-X/~TiC„.-$#""0 
f  M(  )[  ]  {  X  > 

The characters of each alphabet are ordered as shown, from left (low) to
right (high).

3.4  One  or  more  spaces  (freely  determined)  occur  between  names.  The 
only  exceptions  are  names  that  are  single  marks  (alphabet  4,  above)  and 
can  be  disambiguated.  For  these,  spaces  can  be  omitted. 


EXAMPLES  A, B instead of A , B
          -3 instead of - 3
          (A + B) instead of ( A + B )

3.5 Parentheses are used around any expression that would otherwise be
ambiguously interpreted. Conversely, parentheses can be dropped
whenever there is no possibility of ambiguity.

3.6  To  avoid  excess  parentheses,  an  order  of  precedence  exists  for  names 
used  as  separators.  The  higher  in  the  order,  the  greater  the  binding  power, 
i.e.,  the  greater  precedence  in  being  interpreted  first.  The  following  order 
is  consistent  with  the  alphabetical  order: 

:=     I     I     ;     I     .     I     :     I     ^     I     ^     I  I  I 

V|A|^|=^|<><>|-I--|X/ 
I     —     I     t     I     1     I     3     I     /(abbreviation)^ .  -  (hyphen) 

3.7 Spacing on the page is freely determined (e.g., for legibility). An
expression may run freely on several consecutive lines (with no explicit
continuation mark).


3.8 Subscripting and superscripting may be used interchangeably with
the marks ↓ and ↑ respectively.

EXAMPLE  10 ↓ 2 is the same as 10₂
         x ↑ 2 is the same as x²


4.    Commands:  assignment,  abbreviation,  variables,  forms 

4.1 If x is a free name [as defined in General Conventions section 10
(GC 10)] and y is any expression, then the command

x := y

assigns the name x to the corresponding expression y. In particular,
class(x) = class(y).

EXAMPLE  BILL := C(operation-rate: 10 ↑ 6 o/s) assigns a name to a
particular (partially specified) computer.

4.2 If there are several assignment expressions for a single name x:

x := a
x := b

etc.; then x is assigned to be the name of the union of all the expressions:

class(x) = union(class(i)) over i = a, b, ...

EXAMPLE  M.1 := M(size: 1000 w) and M.1 := M(size: 2000 w) would
define M.1 to be memories of either 1,000 or 2,000 words.
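
The union rule of GC 4.2 can be imitated with ordinary sets. The Python sketch below is only an analogy, under the simplifying assumption that each right-hand side is treated as a single opaque expression (a string) rather than as an indefinite expression with a class of its own; the function name class_of is ours, chosen to echo class(x).

```python
def class_of(assignments, name):
    """Union of the right-hand sides of every assignment made to `name`.

    `assignments` is a list of (name, expression) pairs; expressions are
    kept as opaque strings, which is a simplification of class(i).
    """
    result = set()
    for target, expression in assignments:
        if target == name:
            result |= {expression}
    return result

# The two assignments from the example above.
ASSIGNMENTS = [
    ("M.1", "M(size: 1000 w)"),
    ("M.1", "M(size: 2000 w)"),
]

if __name__ == "__main__":
    print(sorted(class_of(ASSIGNMENTS, "M.1")))
```

Repeating an assignment adds nothing new, since a set union ignores duplicates.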

4.3 If x is any name and y is any name, then the command

x / y

assigns y to be an abbreviation (a synonym) for x. Abbreviation may
occur on any occasion and not just when x is first defined. It may occur as a
separate expression or it may occur in an expression in which x occurs,
thus establishing the abbreviation in passing. A sequence of abbreviations
may be defined in the same expression.

COMMENT  The abbreviation may not be a shorter phrase at all, but simply
an alternative phrasing (say, one commonly known).

EXAMPLE  Memory / M, bit / b, second / sec / s
         multiplex / many channeled

COMMENT  / is also used for division, but no difficulties arise.


EXAMPLE  z⟨0:11⟩ := (¬ib → z″;     This ISP expression and also
         ib → M[z″])               this comment are on two lines.


4.4 If x is any name and D is any indefinite expression, then the command

x := D-variable




assigns x to be a variable with the set of entities of class(D) as the
domain. If there are no restrictions on the domain of the variable, then the
D may be dropped.

EXAMPLES  x := number-variable

          y := component-variable

          z := variable       no restricted domain

COMMENT  Note that these variables are over entities, not over
expressions (as are the expression-variables x, y, z).

4.5 A form is any expression containing variables. If f is a form
containing a single free name x (in addition to variables and defined
subexpressions) and g is a form, then we extend the assignment command to
include

f := g

which is taken as defining the name x. The variables occurring in f are
called the operands of x. An occurrence of the form f with variables
replaced by expressions designating in the domain of the variables is
equivalent to the expression g with these same variables replaced by their values
from the occurrence of f. This permits the definitions of functions and
operations in which the operands (the variables in f) can be identified by
the form of their occurrence.

EXAMPLES  x := number-variable       y := number-variable
          x ∸ y is a form
          abs(x) is a form

          abs(x) := (x ≥ 0 → x; x < 0 → -x)       defines abs(x)
          x ∸ y := max(x - y, 0)                   defines x ∸ y
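
Readers used to programming languages may find it helpful to see the last two definitions written as ordinary functions. The Python sketch below is an analogy only: the notation defines forms by substitution, whereas Python evaluates function calls, and the names abs_form and monus are ours (the operation defined in the example is written here as a named function because Python has no infix symbol for it).

```python
def abs_form(x):
    """abs(x) := (x >= 0 -> x; x < 0 -> -x), written as a conditional."""
    if x >= 0:
        return x
    return -x

def monus(x, y):
    """The binary operation defined above by the form max(x - y, 0)."""
    return max(x - y, 0)

if __name__ == "__main__":
    print(abs_form(-4), monus(5, 3), monus(3, 5))
```

The operands of each definition are identified by their position in the form, just as the parameters x and y are identified by position in the function signature.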

5.    Indefinite  expressions 

5.1  An  indefinite  expression  is  characterized  completely  by  giving  the 
class  associated  with  the  expression. 

5.2 The basic evaluation rule is the following:

If A contains an occurrence of another indefinite expression B, then
class(A) is the union of the classes of all the expressions formed by
replacing the occurrence of B by each member of class(B). In symbols,

class(A(... B ...)) = union(class(A(... b ...))) over b in class(B)

EXAMPLE  X := M(size: 1000 w)
         Y := C(Mp: X)

class(Y) contains  C(Mp: M(size: 1000 w; width: 12 b))
                   C(Mp: M(size: 1000 w; width: 16 b))
                   C(Mp: M(size: 1000 w; speed: 1000 o/s))
                   etc.

5.3 Indefinite expressions can be formed in five ways:

1  Postulation: an expression is given in the initial definition in this
appendix.

EXAMPLE  Entity is so defined in GC 7.

2  Specialization: If A contains an occurrence of another indefinite
expression, B, and x is any expression for a subset of class(B), then
the expression formed by replacing the occurrence of B in A by x
yields a legitimate expression. In symbols, if A(... B ...) is legal
and x is legal and class(x) ⊂ class(B), then A(... x ...) is legal.

EXAMPLE  In the example of GC 5.2, the expressions of the
members of class(Y) are legal expressions.

3  Alternation: If x, y, ... are any expressions, then x | y ... is the
indefinite expression "either x or alternatively y or alternatively ...."
In symbols,

class(x | y ...) = union(class(i)) over i = x, y, ...

COMMENT  Note that x := a and x := b is equivalent to x := a | b.

EXAMPLE  number-name := integer | decimal

4  Range: If x and y designate members of an ordering, such that
x < y, then

x ~ y

is the indefinite expression containing all members of the ordering
starting with x, up to and including y.

EXAMPLE  7 ~ 11 is equivalent to 7 | 8 | 9 | 10 | 11

5  Approximation: If x designates a member of an ordering, then ~x
is an indefinite expression containing x plus members of the order
on both sides of x, without specification of the exact limits.

EXAMPLE  ~10 is a set of numbers around 10, possibly 8 | 9 | 10 | 11.

COMMENT  Of the above five ways of defining indefinite expressions,
specialization and alternation correspond to the usual definition of a simple
phrase-structure grammar (Backus Normal Form, BNF); BNF is often used
to define programming languages.
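
The class semantics of alternation and range can be sketched in Python by modelling class(x) as a set of integers. This is only an illustration of the definitions above; the function names `value_range` and `alternation` are assumptions, not part of the notation.

```python
def value_range(x, y):
    """Class of the range expression from x up to and including y (requires x < y)."""
    if not x < y:
        raise ValueError("a range requires x < y")
    return set(range(x, y + 1))


def alternation(*classes):
    """Class of an alternation: the union of the classes of the alternatives."""
    result = set()
    for members in classes:
        result |= members
    return result


# The range from 7 to 11 has the same class as the alternation 7|8|9|10|11.
print(value_range(7, 11) == alternation({7}, {8}, {9}, {10}, {11}))
```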




6.    Lists  and  sets 

6.1    If x is any expression, then

x-list

is an abbreviation either for

x, x . . .
or for

x; x . . .

x-list designates an ordered set of entities designated by x, with repetition
permitted. The choice of a comma or a semicolon for the separator is
semantically irrelevant. The two choices permit the nesting of comma
lists within semicolon lists without parentheses. (Recall the order of
precedence of comma over semicolon.)

EXAMPLE    4, 6, 3, 6, 9 is an instance of digit-list

(3; 2, 5; 6; 4, 3, 8; 7) = (3, (2, 5), 6, (4, 3, 8), 7)
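
The nesting rule (comma binds more tightly than semicolon) can be sketched as a two-level split. The helper `parse_list` is hypothetical and handles only integer items, which is enough to reproduce the example above.

```python
def parse_list(text):
    """Split on semicolons first, then on commas, so comma lists nest inside."""
    items = []
    for group in text.split(";"):
        values = [int(v) for v in group.split(",")]
        # A group with a single value stays a plain item; longer groups nest.
        items.append(values[0] if len(values) == 1 else tuple(values))
    return tuple(items)


print(parse_list("3; 2, 5; 6; 4, 3, 8; 7"))
```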

6.2    If x is any expression, then

x-set

is an abbreviation either for

x, x . . .
or for

x; x . . .

except that no repetition is permitted. x-set designates an unordered set
of entities designated by x. The choice of comma or semicolon is
semantically irrelevant, as above.

EXAMPLE    (3, 6, 2) and (2, 3, 6) are the same entity, as instances of
digit-set.

(3, 3) is not an instance of digit-set.

7.  Definite  expressions 

7.1  All definite expressions can be defined by specialization of the
indefinite expression entity. In the following, all names are legitimate, as
defined in GC 10. Also, any expression that occurs without
expression-variables in it is a legal expression of the language as it stands.

7.2  entity := (parameter-set)

parameter := attribute: value
          := value                     if attribute can be inferred from value
          := attribute | ¬attribute    if value is binary-value
          := quantity / entity         if attribute can be inferred from entity

value := entity | ?

binary-value := boolean | (1 | 0) | (on | off) | (high | low) |
                (exist | not-exist) | (+ | -) | (positive | negative)

An  entity  may  be  defined  (or  described)  by  listing  its  attributes  and  values 
explicitly.  There  is  no  natural  ordering  on  the  attributes,  so  they  form  a 
parameter-set  rather  than  a  parameter-list.  The  value  may  be  any  entity, 
but  for  each  attribute  there  will  be  a  domain  of  possible  entities.  This 
domain  can  always  be  given  as  an  indefinite  expression.  The  question  mark 
can  be  used  when  the  value  is  uncertain.  A  parameter  always  defines  both 
an  attribute  and  a  value  but  may  be  abbreviated  in  several  ways  if  the 
context  makes  clear  what  the  attributes  and  values  are. 

1  Both  the  attribute  and  value  may  be  given  explicitly 
EXAMPLE    M(size: 100 w)

2  The  attribute  may  be  dropped,  if  the  value  uniquely  determines 
the  attribute. 

EXAMPLE  M(1000 w) is legal because the only attribute of a memory
that has a number of words as value is size.

COMMENT  What is inferable is somewhat ill-defined, because it
depends on the information available to the reader of the expression
(whether man or machine). The simplest case is when the
value is a quantity whose unit is uniquely associated with the
attribute, as in the example above. Another is when the value is a
member of a class (or a subset of that class) and the attribute is the
class name (see GC 8.5).

3  Binary-valued attributes may drop the value and use the occurrence
of the attribute to symbolize the positive sense and the negated
attribute to symbolize the negative sense.

EXAMPLE    M(destructive_read) for M(destructive_read: yes)
           M(¬destructive_read) for M(destructive_read: no)

4  If the parameter gives some kind of unit quantity, then it is often
natural to state the parameter in the form of quantity per entity
(quantity / entity), where the attribute either is the attribute itself
(the unit to be defined) or permits inference of the attribute.

EXAMPLE    Memory(word: 32 bits) = Memory(32 bits/word)

Control(number_devices_controlled: 3) = Control(3 devices / control)




COMMENT    The  remark  made  in  point  2  above  on  "inferable"  holds 
here  as  well. 

7.3  entity  :  =  attribute(entity) 

An  entity  can  be  designated  as  the  value  of  an  attribute  of  some  other  entity. 
COMMENT    This  is  simply  standard  functional  notation. 
EXAMPLE    Pc(speed:  speed(Mp)) 

7.4  entity  :=  A(parameter-set) 

An entity can be defined as having all the parameters of the indefinite
expression A, further specialized, modified, or augmented by the given
parameter-set.

COMMENT  This permits one entity to be defined as an instance or further
specification of another "general" entity, allowing the equivalent of
subroutining in building up a system of definitions. It also permits one entity
to be defined as like another except in certain specified respects.

EXAMPLE    Let M := Component(size: +integer word; color: blue)

M(size: 100 ~ 1000 word)           further specification
M(size: 100; o-rate: 10 s/word)    further specification, if Component defines o-rate
M(color: red)                      definition by exception
M(size: 100; weight: 300 lb)       definition by augmentation
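
One way to picture definition by further specification, exception, and augmentation is as a dictionary merge in which the given parameters override or extend those of the general entity. This is a rough sketch under that assumption; `specialize` is a hypothetical helper, and parameter values are kept as plain strings.

```python
def specialize(base, **params):
    """Return a new entity: base's parameters, overridden or extended by params."""
    return {**base, **params}


# M := Component(size: +integer word; color: blue)
M = {"size": "+integer word", "color": "blue"}

print(specialize(M, color="red"))                        # definition by exception
print(specialize(M, size="100 word", weight="300 lb"))   # definition by augmentation
print(M)                                                 # the general entity is unchanged
```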

7.5  entity := entity-set | entity-list | labeled-entity-set | labeled-entity-list
labeled-entity := label: entity

label := simple-name

An entity can be a set or a list of entities. It is possible to affix labels to
the entities of a set or list to make referencing easier.

EXAMPLE    C(M: Mp, Ms, M.ps)    declares the memory of C

T(co-components: to: L.1, from: L.2)    to and from are labels

7.6  entity := +integer entity

An abbreviation for a list of a specified number (the +integer) of entities,
as specified in the entity following the +integer. If the specifying entity
is an indefinite expression, then each of the entities is independent.

EXAMPLE  12 M(tape)  where each M(tape) may have different further
specifications.

7.7  entity := number | quantity | predicate | entity-name
Each  of  these  possibilities  is  taken  up  in  later  sections. 


8.  Attributes 

8.1  The  following  gives  the  possibilities  for  attributes.  It  also  provides  for 
the  automatic  definition  of  certain  attributes.  Throughout,  let  x  be  the 
entity  whose  attribute  is  being  defined  and  let  V  be  the  domain  of  values 
of  the  attribute. 

8.2  attribute  :  =  simple-name 

Simple-names  provide  freely  definable  attributes,  without  restriction  on  use. 

EXAMPLE  C(user_efficiency: fraction)  an attribute called user_efficiency
can simply be defined and given any domain desired.

8.3  attribute  :  =  label 

if x is a labeled-list or labeled-set

The labels of a labeled-list or labeled-set automatically become attributes.

8.4  attribute  :=  V 

Often  there  exists  no  separate  name  for  an  attribute  other  than  the  set  of 
values  it  can  take  on  (V),  which  already  has  an  appropriate  expression  in 
the  language. 

EXAMPLE  C(Mp: M(1000 w; 32 b/w))  where Mp serves as the attribute,
being also the domain.

8.5    attribute := attribute: attribute . . .

A sequence of attributes, interpreted as making an iterated sequence of
selections, can serve as a single attribute. The first (leftmost) attribute
determines a value of x; the next attribute determines a value in the
parameter set of this value, and so on through the sequence.
In symbols:

a:  b:  .  .  .  q(x)  =  q(p(.  .  .  b(a(x))  .  .  .)) 

EXAMPLE    X := C(Mp(size: 1000 w))
size:  Mp(X)  =  1000  w 

8.6  attribute := a

if q: p: . . . b: a is an attribute of x and there is only
one value of x to any depth with attribute a

The front end of an attribute sequence can be dropped if the remainder
uniquely identifies the value; that is, if there is only one occurrence of a
within x and its values.

EXAMPLE    X  :=  C(Pc,  Mp,  Ms) 

add-time(X)  is  defined,  since  only  Pc  has  an  add-time. 
size(X)  is  not  defined,  since  both  Mp  and  Ms  have  size  as  an 
attribute. 
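
The uniqueness rule of GC 8.6 can be sketched as a search through a nested parameter set that succeeds only when exactly one occurrence of the attribute is found at any depth. The nested-dictionary representation, the helper names, and the concrete values for add-time and size are all assumptions made for illustration.

```python
def find_values(entity, name):
    """Collect every value of attribute `name` found at any depth of `entity`."""
    found = []
    for key, value in entity.items():
        if key == name:
            found.append(value)
        elif isinstance(value, dict):
            found.extend(find_values(value, name))
    return found


def attribute(entity, name):
    """Return the value of `name`, defined only if it occurs exactly once."""
    values = find_values(entity, name)
    if len(values) != 1:
        raise LookupError(f"{name} is not defined: {len(values)} occurrences")
    return values[0]


# X := C(Pc, Mp, Ms), with illustrative (made-up) parameter values.
X = {
    "Pc": {"add-time": "10 us"},
    "Mp": {"size": "1000 w"},
    "Ms": {"size": "100000 w"},
}

print(attribute(X, "add-time"))   # defined: only Pc has an add-time
try:
    attribute(X, "size")          # not defined: both Mp and Ms have a size
except LookupError as err:
    print(err)
```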

8.7  attribute  :=  attribute-list 

The  value  is  a  value-list  that  corresponds  one-to-one  with  the  attributes  of 




the  attribute-list.  This  is  an  abbreviation  technique  that  permits  writing 
the  attribute  names  only  once  for  a  list  of  values,  each  of  which  has 
several  subattributes. 

EXAMPLE    operation-times := (add-time, store-time)  has values
(10 μs, 6 μs), (20 μs, 20 μs), etc.

8.8    attribute := x-name

This is a single special attribute, defined for each entity x. See GC 10.10
for definition.

8.9    attribute := index / #

where value(index) := +integer | -integer

if x is a list (more generally, of form z o z . . .)

The elements of a list (or other sequence) are automatically indexed by
their number from the front (+integer) or the end (-integer) of the list.
This index can be used as an attribute.

EXAMPLE    x := (Ma, Mb, Mc, Md)

x(index: 3) = x(#: 3) = x(3) = Mc
x.4 = x.-1 = Md
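
The indexing convention counts from 1 at the front and from -1 at the end. A minimal sketch follows; the helper `index` is hypothetical, and rejecting index 0 is an assumption of this sketch rather than something the text states.

```python
def index(seq, i):
    """Select by position: 1-based from the front, or negative from the end."""
    if i == 0:
        raise IndexError("index 0 is not used in this sketch")
    return seq[i - 1] if i > 0 else seq[i]


x = ("Ma", "Mb", "Mc", "Md")
print(index(x, 3))                   # Mc
print(index(x, 4), index(x, -1))     # Md Md
```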

9.  Null  symbol  and  optional  expression 

9.1  Let ∅ be the null expression

class(∅) = the null class

∅ may occur as the defining expression in an assignment or as a member
of an alternation:

x := ∅

x|∅|y

∅ may occur as a member of a set or list, in which case it may be deleted
from the set or list.

x, ∅, y is equivalent to x, y

9.2  If x is any expression, define the optional expression
*x to be (x | ∅)

Thus, if *x occurs in any expression, it means that either x can occur there
or ∅, that is, x has an optional occurrence.

EXAMPLE    (1, *2, 3, *4) = (1, 2, 3, 4) | (1, 3, 4) | (1, 2, 3) | (1, 3)
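
The expansion of optional members into an alternation of lists can be sketched with a Cartesian product over "keep" and "drop" choices. The representation of a list as (value, optional) pairs and the name `expand_optional` are assumptions made for this illustration.

```python
from itertools import product


def expand_optional(items):
    """items: (value, optional) pairs. Return every list obtained by keeping
    or dropping each optional member; mandatory members are always kept."""
    choices = [((value,), ()) if optional else ((value,),)
               for value, optional in items]
    return [tuple(v for part in combo for v in part)
            for combo in product(*choices)]


# The list with members 1, optional 2, 3, optional 4.
for alternative in expand_optional([(1, False), (2, True), (3, False), (4, True)]):
    print(alternative)
```

Dropping a member here plays the role of substituting the null expression and then deleting it from the list.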

10.  Names 

10.1    Names  are  expressions  distinguished  by  two  things: 

1  They  are  composed  of  strings  of  characters,  which  are  not  them- 
selves expressions. 

2  They  are  written  without  spaces  between  the  characters. 


10.2  There  is  a  special  class  of  expressions  called  name-expressions,  which 
are  used  to  define  names. 

1  Name-expressions  all  have  names  that  are  of  the  form  x-name, 
where  x  is  a  name. 

2  Name-expressions  are  written  with  spaces,  which  are  to  be  removed 
in  generating  strings  of  characters  from  them. 

3  Name-expressions occur only in conjunction with name-expression
names, either as an assignment:

x-name := name-expression

or as an attribute-value:

x-name: name-expression

Thus, it can always be determined when a name-expression
occurs.

EXAMPLE    Q-name := A B (1|2)    defines Q-name

AB1 and AB2 are the two possible Q-names.
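
Generating the names defined by a name-expression amounts to choosing one alternative at each position and concatenating without spaces. In this sketch a name-expression is represented as a list of alternations, which is an assumed encoding; `names` is a hypothetical helper.

```python
from itertools import product


def names(name_expression):
    """name_expression: a list of positions, each a list of alternative strings.
    Return every name formed by picking one alternative per position."""
    return ["".join(combo) for combo in product(*name_expression)]


# Q-name := A B (1|2)
print(names([["A"], ["B"], ["1", "2"]]))
```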

10.3    Alphabets are defined as the alternates of their characters, e.g.,

digit := 0|1|2|3|4|5|6|7|8|9

Capital letters, small letters, marks, and characters, as laid out in GC 3.3,
are defined similarly.

10.4    If x is any set of characters, then

x-string

is a string of such characters of indefinite length (at least one) with no
spaces between.

EXAMPLE    digit-string contains 1, 1354, 65487, etc.

COMMENT  Note  that  expression-variables  are  being  extended  to  cover  sets 
of  characters  and  character  strings,  even  though  these  are  not  always 
expressions. 

10.5    name := simple-name | compound-name | number-name | x-name

10.6    simple-name := primitive-name | phrase-name | hyphen-name
primitive-name := (capital-letter | small-letter | digit)-string
phrase-name := primitive-name_primitive-name . . .
hyphen-name := phrase-name-phrase-name . . .

Simple-names are strings of letters and digits or phrases made up of such
strings with space concatenation marks (_) (phrase-names) or with hyphens
(-) (hyphen-names). All simple-names function identically: they obtain their
designations through assignment (:=) or abbreviation (/). They may thus
be definite or indefinite, corresponding to the expressions they name. Any
simple-name may be used if it has not already been used for a different
expression or is not excluded by number-name or by a previously defined
x-name (see below).

EXAMPLES    AB3    SAM    Baker    Instruction_set    input-register    13-B

ABBREVIATION  If there is no chance for ambiguity, phrase-names may be
written with a space instead of the space-concatenation mark (_).

EXAMPLE    skip condition = skip_condition

ABBREVIATION  If the hyphen-name x-a is used within the scope of the
definition of the entity x, then the name may be abbreviated to just a.

COMMENT  This permits the use of the same name in local contexts, where
the name of the context (the expression being defined) serves to
disambiguate the name where needed.

EXAMPLE    data-type  :=(...  data-type-component:  data-type  .  .  .) 

data-type  :=(...  component:  data-type  .  .  .)  alternative  form 

10.7    compound-name := S . v . v . . .

where S is an indefinite simple-name and the
v are simple-names.

The compound-name has the same designation as

S(v; v . . .)

where each of the v's defines a parameter whose attribute may be dropped
because the v is self-identifying. Thus a compound-name is an abbreviation
technique that constructs a name for an entity by conjoining a series
of modifying attribute values to the type of the entity.

EXAMPLE    Memory.primary    is an abbreviation for
Memory(function: primary)

ABBREVIATION  An intervening period may be dropped if no ambiguity
results.

EXAMPLE    Mp is the same as M.p

Mprimary is the same as M.primary       though poor taste

COMMENT  Compound names have the desirable feature that the leading
symbol (leftmost) gives the kind of entity being designated, e.g., M.primary
is a kind of memory.

10.8    number-name. Defined in GC 11.


10.9  x-name.  The  names  to  be  used  in  defining  an  immediate  instance 
of  the  entity  x.  If  x  is  any  entity  and  y  is  any  name-expression,  such  that 

x := (x-name: y; . . .)

then any z which is an instance of x,

z := x( )

must be chosen from the name-expressions defined by y. This holds only
for a single level. If w := z(. . .), then w is not constrained as to the name
used.

EXAMPLE    component := (component-name: capital-letter)
M := component(. . .)    is legal;
SAM := component(. . .)    is not legal;
SAM := M(. . .)    is legal.

11.  Numbers 

11.1    number := number-name | number-variable | number ↓ base |
arithmetic-expression | count-expression

number-name := integer | decimal

integer-name / integer := *sign digit-string       recall * means optional

sign := + | -

+integer-name / +integer := digit-string       includes 0
-integer-name / -integer := - +integer
decimal-name / decimal := integer . digit-string
base := +integer

arithmetic-expression := unary-arithmetic-operation number |
number binary-arithmetic-operation number |
number n-ary-arithmetic-operation number . . . |
arithmetic-function(number-list)

unary-arithmetic-operation := - | +

binary-arithmetic-operation := - | / | exponentiation / exp / ↑ |
modulo / mod

n-ary-arithmetic-operation := + | ×

arithmetic-function-operation := log ↓ 2 | absolute-value / abs |
entier | maximum / max | minimum / min | average / avg | sum |
product / prod

count-expression := number(x-set) | number(x-list)

Numbers are defined in the standard way, starting with number-names
for integers (1324 or -14) and decimals (13.23). If the base of the number
system is different from 10, it may be given explicitly (for example, 10 ↓ 2
= 10₂ = 2). Arithmetic expressions are formed from various arithmetic
operations with numbers as operands. Operations are classified by their
syntactic form: unary operations (-(3) or +(7)); binary operations (7 - 6,
3/8 or 3 ↑ 2 = 3²); and n-ary operations (3 + 8 + 6 or 5 × 6 × 2 × 3).
Functions are defined as taking a list of numbers as operands (abs(3) or
max(5, 7, -12)). There is a counting function that takes any set or list of
entities as inputs and produces their number (if X := (Ma, Mb, Mc) then
number(X) = 3). Abbreviations are introduced for many of the operations
and functions.
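
The examples in this paragraph map directly onto ordinary arithmetic. The sketch below evaluates them in Python; the helper names `number_in_base` and `count` are assumptions introduced here, standing in for the base notation and the counting function.

```python
from math import prod


def number_in_base(digits, base):
    """Value of a digit-string read in the given base."""
    return int(digits, base)


def count(entities):
    """The counting function: the number of entities in a set or list."""
    return len(entities)


print(number_in_base("10", 2))      # 10 in base 2 is 2
print(3 ** 2)                       # exponentiation: 3 to the power 2
print(sum([3, 8, 6]))               # n-ary addition: 3 + 8 + 6
print(prod([5, 6, 2, 3]))           # n-ary multiplication: 5 x 6 x 2 x 3
print(max(5, 7, -12))               # function over a number-list
print(count(("Ma", "Mb", "Mc")))    # number(X) for X := (Ma, Mb, Mc)
```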

11.2    number-set-name := (digit | Φ)-string

A special subset of (alternative) numbers may be defined by substituting a
Φ for a digit. The Φ stands for any digit (of the base of the number).

EXAMPLE    01Φ = 010 | 011    in binary

7Φ = 70 | 71 | . . . | 77    in octal

12.    Quantities,  dimensions,  and  units 

quantity := number unit

unit := (dimension; conversion-list) | unit-name

unit-name := multiplier unit | simple-name

conversion := number-name unit | number-name / unit |
arithmetic-expression(unit)

multiplier := pico / p := 10⁻¹² | nano / n := 10⁻⁹ |
micro / μ / u := 10⁻⁶ | milli / m := 10⁻³ | centi / c := 10⁻² |
kilo / k := (10³ | 2¹⁰) |
mega := 10⁶ | giga / g := 10⁹

dimension := (base-unit: unit) | [dimension-expression]

dimension-expression := dimension | dimension × dimension |
dimension / dimension

A quantity is a number of units of a given dimension. A unit is defined
by the dimension and the conversion between the given unit and other
units of the same dimension. Conversions can be expressed either as the
amount of the other unit for each of the given units (e.g., 1 minute is 60
seconds) or as the amount of the given unit per each of the other units
(e.g., 1 minute is 1/60 per second = .0167 / second). When conversions
are not linear, it is necessary to use functions of the other unit. Thus, for
bits the conversion to states is log2(states) (e.g., 128 states is equivalent to
log2(128) = 7 bits).
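
The two kinds of conversion, linear and functional, can be sketched as follows. The function names are illustrative assumptions; only the numeric relationships (60 seconds per minute, bits as the base-2 logarithm of states) come from the text.

```python
import math


def minutes_to_seconds(minutes):
    """Linear conversion: 1 minute is 60 seconds."""
    return 60 * minutes


def seconds_to_minutes(seconds):
    """The same conversion stated per other unit: 1/60 minute per second."""
    return seconds / 60


def states_to_bits(states):
    """Non-linear conversion: the number of bits is log2 of the number of states."""
    return math.log2(states)


print(minutes_to_seconds(1))             # 60
print(round(seconds_to_minutes(1), 4))   # 0.0167
print(states_to_bits(128))               # 7.0
```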

Each dimension has a base unit (e.g., seconds for the dimension of time).
A dimension may also be given as a product of two other dimensions (e.g.,
[energy] is [force × distance]) or the ratio of two other dimensions (e.g.,
[velocity] is [length / time]). We use the standard bracket notation to
indicate dimension (e.g., [l/t] for the dimension of velocity).


13.    Boolean  and  relations 

boolean := true / t / 1 | false / f / 0 | boolean-variable |
boolean-expression | relational-expression

boolean-expression := unary-boolean-operation boolean |
boolean binary-boolean-operation boolean |
boolean n-ary-boolean-operation boolean . . .

unary-boolean-operation := ¬

binary-boolean-operation := ⊃ | ≡

n-ary-boolean-operation := ∨ | ∧ | ⊕

relational-expression := number relational-operator number
relational-operation := = | ≠ | < | > | ≤ | ≥ | ≡ | ≢

There are two primary boolean values, true and false. Boolean-variables,
boolean-expressions, and relational-expressions are expressions that evaluate
(potentially) to true or false. Boolean expressions are made up from the
standard operations on truth values: negation (¬), implication (⊃),
equivalence (≡), conjunction (∧), disjunction (∨), and exclusive-or (⊕). Relations
are defined on numbers.

COMMENT  More general definitions for entities (for = and ≠) and for
ordered sets (for <, >, ≤, and ≥) are not needed.

PMS  conventions 

Making use of the prior general conventions, PMS is developed
systematically through the definitions of the various components: P, M, S, etc. Much
of the development repeats common abbreviations and conventions, simply
to provide a self-contained notational system.


1   Dimensions
2   General units
3   Information units
4   Component
5   Link (L)
6   Memory (M)
7   Switch (S)
8   Control (K)
9   Transducer (T)
10  Data (D)
11  Processor (P)
12  Computer (C)



1.  Dimensions 

1.1  Definition  of  dimension,  repeated  from  GC  12. 
dimension  :=  (base-unit:  unit)  |  [dimension-expression] 

1.2  Basic  dimensions 

time / [t] := dimension(base-unit: second)
length / [l] := dimension(base-unit: meter)
cost / [$] := dimension(base-unit: dollar)
weight := dimension(base-unit: kilogram)
power := dimension(base-unit: watt)
temperature := dimension(base-unit: degree-centigrade)
voltage := dimension(base-unit: volt)
current := dimension(base-unit: ampere)
component / [c] := dimension
operation / [o] := dimension
information / [i] := dimension(base-unit: bit)
state := dimension(base-unit: state)

2.  General  units 

2.1  Definition  of  unit,  repeated  from  GC  12. 

unit := (dimension; conversion-list) | unit-name

unit-name := multiplier unit | simple-name

conversion := number-name unit | number-name / unit |
arithmetic-expression(unit)

2.2  We  give  the  basic  units,  but  no  variations  with  multipliers. 

second / sec / s := unit(dimension: time)
minute / min := unit(dimension: time; conversion: 60 s)
meter / m := unit(dimension: length)
foot / ft := unit(dimension: length; conversion: 3.28 / meter, 12 in)
inch / in := unit(dimension: length; conversion: 39.37 / meter, 12 / ft)
dollar / $ := unit(dimension: cost)
operation / o := unit(dimension: operation)
watt / w := unit(dimension: power)
volt / v := unit(dimension: voltage)
ampere / amp / a := unit(dimension: current)


kilogram / kg := unit(dimension: weight; conversion: 2.2 lb)
pound / lb := unit(dimension: weight; conversion: 2.2 / kg)

3.    Information  units 

3.1  Units 

state := unit(dimension: state; conversion: 2ˣ bits)

binary-digit / bit / b := unit(dimension: [i]; conversion: log2(x) states)

octal-digit / od := unit(dimension: [i]; conversion: 3 bits)

decimal-digit / digit / d / dit (rare) := unit(dimension: [i]; conversion:
log2(10) bits, log10(x) states)

hexa-decimal-digit / hex := unit(dimension: [i]; conversion: 4 bits)

character / char / ch := unit(dimension: [i]; conversion: 4 ~ 8 bits)

byte / by := unit(dimension: [i]; conversion: 8 bits)

COMMENT  The byte is almost standardized at 8 bits;
occasional use otherwise, although not in this book.

3.2  I-units

i-unit := base-unit | length × i-unit | i-unit-name | (base-unit;
length-list; content: product(length-list) base-unit; level: number(length-list))

i-unit-name := i-unit-prefix i-unit-name | simple-name

i-unit-prefix := +integer | multiple/m | quadruple/q | triple/t |
double/d | *single/s | half/h | fractional/fr

base-unit := unit(dimension: [i])

length := +integer

The i-unit is a hierarchically organized information structure, in which
each level consists of a number of subunits, all identically organized. The
number of subunits in a level is called its length. Units eventually occur
that cannot be decomposed further. These are called base-units and are
some unit of information, e.g., the bit or the character. Thus, if the
lengths are L₁, L₂, . . . , Lₙ and the base unit is the bit, then the total
amount of information (the content of the i-unit) is L₁ × L₂ × . . . × Lₙ
bits and the number of levels is n. The i-unit may be likened to an
n-dimensional rectangular volume of information (except that the "dimensions,"
that is, the lengths, occur in a fixed order).

COMMENT  Almost all information in computer systems is organized in
terms of i-units; e.g., a memory consists of a number of words, each of a
number of characters, each of a number of bits. More exotic data structures
are invariably encoded into i-units and are not reflected in the hardware.




word := length × bits | length × character | length × base-unit
word-bit-length := 12 ~ 64
word-character-length := 2 ~ 8
block := length × word | length × character
record := length × word | length × character
file := +integer × block | +integer × record
IBM-card / card := column × row × card-hole
card-column / col := 80
card-row / row := 12
card-hole := 1 bit
print-line / line := print-column × character
print-column / col := 64 ~ 132 | 72 | 80 | 120 | 132     rarely < 64
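
The content and level of an i-unit follow directly from its length-list: the content is the product of the lengths times the base-unit, and the level is the number of lengths. The sketch below applies this to the IBM card defined above (80 columns × 12 rows × 1 bit per hole); the function names and the memory example figures are assumptions.

```python
from math import prod


def content(lengths, base_unit_bits=1):
    """Total information in an i-unit: product of its lengths times the base-unit size."""
    return prod(lengths) * base_unit_bits


def level(lengths):
    """Number of levels in an i-unit: the number of lengths in its length-list."""
    return len(lengths)


card = [80, 12]                 # columns, rows; the base-unit is a 1-bit card-hole
print(content(card))            # 960 bits per card
print(level(card))              # 2 levels

memory = [1000, 4, 8]           # hypothetical: 1000 words of 4 characters of 8 bits
print(content(memory))          # 32000 bits
```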

4.  Component 

4.1    component := (

component-name:  capital-letter; 

manufacturer-name  /  '  :  'manufacturer  catalog-number; 

operation-set; 

operation-rate-set; 

*subcomponents: (function-attribute: component)-set;

*cocomponents: (function-attribute: component)-set;
port-set; 

function:  (subcomponent-attribute  |  cocomponent-attribute); 

logic-technology; 

'technology; 

reliability: (mean-operations-between-failure / MOBF,
mean-time-between-failure / MTBF);

error-rate:  (erroneous-operations  /  error-free  operations); 

cost:  purchase,  rental; 

lineage; 

history; 

weight; 

power; 

volume; 

area; 

temperature) 


This  single  definition  of  a  computer  component  contains  all  of  the 
attributes  common  to  all  components.  All  components  can  thus  be  given 
as  further  specifications  of  this  definition.  (Such  definitions  can  add  attri- 
butes not  in  the  higher  entity.)  Examples  are  given  in  succeeding  sections. 
We  comment  on  some  of  the  attribute  domains  below  and  provide  an 
extensive  listing  of  values  for  some. 

4.2  Component-name. All components that are immediate instances of
this definition are to have single-letter names, for example, P, M, S, etc.
Names of instances of P, M, S, etc., are arbitrary.

4.3  Manufacturer-names / Proper-name. We provide a very short
abbreviation (') to indicate that a string of characters is a manufacturer's name,
since these names are arbitrary and need to be distinguished from other
values. A proper name can also be given to a component.

EXAMPLES    'IBM System/360 Model 50, 'I/O_Bus

4.4  Operation-set and operation-rate-set. A component is defined
fundamentally by the set of operations it can perform. In PMS such operations
are defined informally and given names (e.g., read, transmit). Significant
performance parameters may be defined, but complete definitions are given
only in ISP. Each operation has a rate (number per unit time), which need
not be constant.

EXAMPLE  A link might have an operation-set consisting of two
transmission operations (one in each direction) of a single i-unit. The
operation-rate might be 10 ↑ 3 o/s for each operation. If the i-unit were 10 b, it
would be given an information-rate of 10 ↑ 4 b/s.
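
The arithmetic in the link example is the product of the i-unit size and the operation rate. A minimal sketch, with `information_rate` as a hypothetical helper name:

```python
def information_rate(i_unit_bits, operation_rate):
    """Information rate in b/s: bits moved per operation times operations per second."""
    return i_unit_bits * operation_rate


# A 10-bit i-unit transmitted at 10^3 operations per second.
print(information_rate(10, 10 ** 3))   # 10000 b/s, i.e. 10^4 b/s
```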

4.5  Subcomponents, cocomponents, function. In general, components
consist of PMS structures of other components, which are called its
subcomponents. Also, in general, a component participates in a PMS structure.
The components to which it is connected are called its cocomponents. The
connecting interface of a component and a cocomponent is called a port.
Conventional names exist that describe the roles the components play in
a PMS structure (e.g., central processor, buffer memory, address switch).
These terms are called functions and can be used to label both
subcomponents and cocomponents.

4.6  port := (

operations: (output | input);
operation-rate / o-rate;
i-unit: [i];

information-rate / i-rate: ((i-unit / operation) × o-rate [i/t]);
concurrency: +integer;

concurrency-type: (simplex | half-duplex | full-duplex | time-multiplex |
multiplex);


direction: (from / out / output / X →) | (to / in / into / input / X ←);
turn-around-time / t.turn: [t]  only for half-duplex carrier;
carrier)
carrier  :  =  ( 

writability: (human / h | machine / mechanical process / m |
both machine and human / b);

readability: (human / h | machine / mechanical process / m |
both machine and human / b);

medium;

encoding)

medium := (electrical conduction := voltage | current) |
magnetic | electrostatic | radiowave | microwave | optical light |
(mechanical movement := tactile | linear position | angular position |
spatial position) | temperature / heat |
(acoustical / airpressure := high frequency audio) | memory technology
see PMS 6.2

encoding / modulation := continuous-modulation / analog |
digital / discrete-modulation

continuous-modulation := direct / null | amplitude / am |
pulse amplitude modulation / pam | pulse duration modulation / pdm |
time duration modulation | frequency modulation / fm

discrete-modulation := direct / pulse code modulation / pcm |
frequency shift keying / fsk | digital pulse | digital level | contact

The  ports  are  the  connection  points  (nodes  or  terminals)  of  a  compo- 
nent at  which  cocomponents  connect.  A  port  is  not  a  component  but 
simply  an  interface  with  a  characteristic  i-unit  that  crosses  it  in  one  direc- 
tion or  the  other.  One  can  thus  associate  two  operations  with  a  port, 
namely, the transmission operations of its component and the cocomponent.
The port introduces directionality: input is from the cocomponent into the
port's component; output is from the port's component to the cocomponent.

The i-unit subcomponents usually correspond to physical subparts of
the port. For conventional information-carrying structures, the base-unit
is the encoding of information on a single wire of the port, i.e., a bit.
The width is the number of wires available per unit time. The length is the
number of (width X base-unit)'s which are necessary to transmit the i-unit.
As such, the i-unit can be thought of as a message normally with length
X width X base-unit. More complex messages can have multiple dimensional
lengths (e.g., consider a record which is transmitted serially, where
the base-unit is a bit, the width is 1, the length is an 8-bit byte, and the
record length is 1,000 bytes).
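
The size of an i-unit described this way can be worked out mechanically. The following is a minimal sketch in Python, not part of the PMS notation; the function name i_unit_bits and its parameters are our own illustrative assumptions. It multiplies the base-unit by the width and by each length dimension, using the serial record just described.

```python
from math import prod


def i_unit_bits(base_unit_bits, width, lengths):
    """Bits in an i-unit: base-unit X width X every length dimension.

    `lengths` holds one entry per length dimension (hypothetical
    parameter name; the text only names length, width and base-unit).
    """
    return base_unit_bits * width * prod(lengths)


# Serial record: base-unit 1 bit, width 1, byte length 8, record length 1,000.
record_bits = i_unit_bits(base_unit_bits=1, width=1, lengths=(8, 1000))
print(record_bits)  # 8000
```

A one-dimensional message, such as a 36-bit word sent in parallel, is the special case lengths=(1,) with width 36.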


The  information  rate  as  measured  at  the  port  is  the  flow  of  i-units  per 
unit  of  time.  An  equivalent  measure  is  the  time  for  the  i-unit  to  pass 
through  the  port.  Concurrency  is  a  measure  of  the  number  of  simul- 
taneous i-units  the  port  can  pass.  Concurrency-type  denotes  both  the 
number  of  simultaneous  messages  and  the  message  direction.  The  simplex 
port  allows  only  one  message  to  enter  or  leave  the  port,  not  both.  The 
half-duplex  port  allows  a  message  to  either  enter  or  leave  the  port,  but 
only  on  a  time-multiplexed  basis;  that  is,  the  port  is  simplex  for  one 
direction  at  a  time.  In  the  case  of  the  half-duplex  port,  the  turnaround 
time  is  a  significant  attribute  that  denotes  the  time  taken  to  go  from  re- 
ceiving to  transmitting  or  vice  versa.  A  full-duplex  port  allows  information 
to flow in both directions at once (i.e., enter and leave the port
simultaneously). Finally, the multiplex port denotes multiple ports that can
be decomposed into the more elementary structures discussed above.

Direction is usually indicated on each port of a component to denote
the direction of information flow. Direction must be specified for simplex
ports (using arrowheads ←, →). Half- and full-duplex ports are shown
with no arrowheads.

Carrier characterizes the form of information at a port. The two major
attributes, writability and readability, define whether human beings,
machines, or both human beings and machines are able to use (interpret) the
carrier directly. Medium denotes the technology of the carrier. Information
can be carried by any of the media listed. It should be noted that memory
technology is also listed as a medium to carry information. Unlike the media
that are instantaneous carriers, memory holds information over a long
period of time. For each medium, it is appropriate to encode information in
particular ways. The two basic methods are continuous and discrete
encoding (or modulation).

4.7  Logic-technology and technology. All devices have a logic technology
and almost always only a single one (though exceptions exist, especially in
compound components). They may also have other technology specific to
the type of component (e.g., disk-memory technology). The logic technology
is given here; other technologies are given with the specific component.

logic-technology := magnetic-core | cryogenic |
    electro-mechanical | fluidic | hybrid-circuit |
    monolithic integrated / integrated / ic | large scale integrated / LSI |
    mechanical | integrated metal oxide silicon / MOS |
    medium scale integrated / MSI | optical |
    transistor | vacuum-tube

4.8  Reliability. Although reliability is of extreme importance, we list
only two values for it: the mean number of operations between failures and
the mean time between failures. In essence, one can be derived from the
other if the operation rate is known.
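
The conversion between the two reliability values is a single division or multiplication by the operation rate. The sketch below is only an illustration; the function names and the sample figures are assumptions chosen for the example and are not measurements of any machine.

```python
def mean_time_between_failures(mean_ops_between_failures, operation_rate):
    """Mean time between failures [s], given operations per failure and
    the operation rate [operations/s]."""
    return mean_ops_between_failures / operation_rate


def mean_operations_between_failures(mtbf_seconds, operation_rate):
    """Inverse conversion: operations per failure from the mean time [s]."""
    return mtbf_seconds * operation_rate


# Hypothetical component: 3.6 X 10^9 operations per failure at 10^6 op/s.
print(mean_time_between_failures(3_600_000_000, 1_000_000))  # 3600.0
```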




4.9  Error rate. Usually a ratio of the number of erroneous operations per
error-free operations. Approximately 1/(probability of an error).

4.10  Cost.  Only  the  two  simplest  cost  numbers,  purchase  price  and 
(monthly)  rental  are  listed  as  attributes.  Conventionally,  purchase  price  is 
taken  as  45  times  monthly  rental.  In  addition,  one  could  list  manufac- 
turing costs,  broken  down  into  materials,  labor,  etc.,  and  more  elaborate 
sales  costs,  such  as  lease-purchase  options.  Most  of  these  quantities  are  not 
relevant  from  an  engineering  viewpoint.  Some  that  are  important  are  un- 
obtainable in  general. 

4.11  lineage := (
    manufacturer: Burroughs |
        Control Data Corporation / CDC |
        Digital Equipment Corporation / DEC |
        English Electric |
        Ferranti |
        General Electric / GE |
        Honeywell |
        International Business Machines / IBM |
        International Computers and Tabulators / ICT |
        Hewlett-Packard / HP |
        Olivetti |
        Radio Corporation of America / RCA |
        Remington-Rand / UNIVAC |
        Scientific Data Systems / SDS / Xerox Data Systems / XDS |
        Westinghouse;
    manufacturer-type: government / g | industrial / i |
        research-laboratory / r | university / u;
    country: Australia / A | Great Britain / B | Canada / C | Denmark / D |
        France / F | Germany / G | Israel / H | Italy / I | Japan / J |
        Netherlands / N | Russia / R | Sweden / S | United States / U;
    'descendants: component-set;
    'antecedent: component-set)

The attributes are mostly self-descriptive. We have not attempted to
list manufacturers other than the principal industrial ones. Descendants
and antecedents are necessarily vague, since no precise notion of
parenthood can be defined. Lineage is not limited to computers built as a
series (as in the IBM 704 being a descendant of the IBM 701) but includes
any machine where the design bond is strong (e.g., IBM 709 and 7090).


4.12  history := (
    t.conception / t.start: date;
    't.announcement / t.paper: date;
    't.birth / t.prototype / t.operational: date;
    't.scheduled: date;
    't.exhibited: date;
    't.delivery / t.production: date-list;
    't.first-delivery / t.first: date;
    't.last-delivery / t.last / t.withdrawal: date;
    't.death / t.last-use: date;
    'production: number(t.delivery))

date := year | month year | day month year | quarter year
quarter / q := winter / 1 | spring / 2 | summer / 3 | fall / 4

The  history  of  the  component  is  viewed  as  a  series  of  event  dates,  only 
the  more  important  being  given  above.  Often  the  same  essential  function 
is  served  by  a  variety  of  events  (e.g.,  the  announcement  of  a  computer  to 
the public can be made either by formal announcement, as happens with
commercial systems, or by a technical paper). Delivery or production
refers to the actual placing of systems and consists of a series of dates,
one for each instance produced. This series is normally abbreviated to the
first and last delivery, plus the number produced. None of the attributes
beyond t.start need exist, as a computer system can be aborted at any time.
For  all  attributes,  the  dates  may  be  known  only  approximately. 

4.13  Weight,  power,  volume,  area,  temperature.  Since  we  concentrate  on 
the  informational  aspects  of  components,  other  attributes  are  mentioned 
only  briefly  (and  others,  such  as  decor,  are  left  out  entirely).  The  values  of 
these  parameters  are  especially  important  in  aerospace  applications.  They 
also  show  the  effects  of  technology  on  packaging  and  computing  power 
per  unit  volume. 

5.  Link 

5.1  Link / L := simple-link | compound-link

5.2  simple-link := component (
    cocomponents: (input: component, output: component, initiators:
        input | output | both);
    subcomponents: ('control; 'input-buffer: M.i-unit; 'output-buffer:
        M.i-unit);
    concurrency: 1;
    concurrency-type: simplex;
    information-rate / i-rate: (i-unit / operation) X o-rate [i/t];
    i-unit: i-unit(input) equals i-unit(output);
    delay / t.delay / td: [t];
    carrier)

A  simple-link  has  the  capability  of  moving  an  i-unit  from  the  input 
cocomponent  to  the  output  cocomponent.  The  simple-link  has  two  simplex 
ports  that  connect  to  the  ports  of  the  two  cocomponents  and  are  sepa- 
rated by  a  delay.  In  essence,  as  the  delay  goes  to  zero,  the  input  port  and 
output  ports  become  one.  Initiation  of  the  transmission  may  be  fixed  at 
one  end  or  the  other  or  be  from  either  end,  depending  on  the  design  of 
the  link.  The  base-unit  is  usually  a  bit  (i.e.,  two  states),  but  it  may  be 
more.  The  width  of  the  i-unit  is  the  number  of  base-units  transmitted  in 
parallel;  and  the  length  is  the  number  of  widths  serially  transmitted  in  one 
operation.  A  simple-link  permits  transmission  in  one  direction  only  (from 
input  to  output  cocomponent);  this  is  normally  called  a  simplex  link.  The 
port-to-port  delay  is  the  time  from  the  initiation  of  the  transmit  operation 
at  one  port  to  the  arrival  of  the  i-unit  at  the  second  port.  (Occasionally, 
the  arrival  time  between  widths  can  be  relevant  operationally,  and  then 
a  more  precise  characterization  of  the  time  structure  would  be  required.) 
The  rate  of  transmission  (the  information  rate)  may  be  calculated  by  taking 
the operation rate times the information transmitted per operation (i.e.,
the  content  of  the  i-unit).  Links  may — but  need  not — contain  buffering  at 
either  end  for  a  single  i-unit.  There  may  be  a  distinct  control  involved, 
especially  if  initiation  and  termination  rituals  must  be  accomplished;  but 
it  is  possible  to  have  links  that  are  simple  wires  and  simply  present  at  the 
output  terminal  what  was  presented  at  the  input. 

EXAMPLE    L[input: register A; output: register B; width: 36 b;
             1 megaword/s]

5.3  compound-link := (
    simple-link(concurrency: 1; concurrency-type: half-duplex) |
    simple-link(concurrency: 2; concurrency-type: full-duplex) |
    simple-link(concurrency: +integer; concurrency-type: broadcast;
        output: component-set) |
    simple-link(concurrency: +integer; concurrency-type: network broadcast;
        input: component-set; output: component-set) |
    simple-link(concurrency: +integer; concurrency-type: star) |
    (simple-link)-set)

A compound-link is made up of several links, but such that no switching
occurs. A half-duplex link permits information to flow from either
terminal to the other, but transmission is possible in only one direction at
a time, which leads to a turnaround delay time. A full-duplex link
permits simultaneous transmission in both directions. Broadcast links
permit transmission to many receivers; thus the output components can be a
set. Network broadcast permits more than one terminal to be a source,
though only one at a time. The star allows all n components of a set to
communicate simultaneously with one another via (n/2) X (n - 1)
full-duplex links.

Finally,  a  set  of  disjoint  links  (that  is,  inputs  disjoint  and  outputs  dis- 
joint) can  be  considered  to  be  a  single  link.  This  latter  is  essentially  a 
convenience  for  naming  a  multiplex  link. 
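
The link count quoted for the star, (n/2) X (n - 1), is simply the number of unordered pairs of components. A short sketch under that reading follows; the helper names star_link_count and star_links are hypothetical and serve only to check the formula against an explicit enumeration of pairs.

```python
from itertools import combinations


def star_link_count(n):
    """Number of full-duplex links joining every pair of n components."""
    return n * (n - 1) // 2


def star_links(components):
    """Enumerate the pairs explicitly, one full-duplex link per pair."""
    return list(combinations(components, 2))


print(star_link_count(4))       # 6
print(len(star_links("ABCD")))  # 6
```

The count grows quadratically with n, which is why a star is practical only for small component sets.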

EXAMPLES    L[Dataphone; 1800 b/s; half-duplex; i-unit: (length: 8,
              width: 1 b)]

            L(Telephone; i-rate: 110 b/s; direction: full-duplex)

            Telephone := L(110 b/s; full-duplex)          alternative form

            I/O Bus := L[half-duplex; i-unit: 1 w; 12 b/w;
              operation-rate: 500 ko/s]

            L['I/O Bus; half-duplex; i-unit: 1 w;
              12 b/w; 500 kw/s]                           alternative form

            L['I/O Bus; half-duplex; i-unit (length: 12 b;
              width: 1 b); 6 megabits/s]                  alternative form
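
The rule of PMS 5.2, that the information rate is the operation rate times the content of the i-unit, can be checked against the examples. The sketch below is illustrative only; information_rate is an assumed helper name, and the figures are those of the register link and the I/O Bus given in the examples.

```python
def information_rate(i_unit_bits, operation_rate):
    """Information rate [b/s] = bits per operation X operations per second."""
    return i_unit_bits * operation_rate


# Register-to-register link: 36 b per word at 1 megaword/s.
print(information_rate(36, 1_000_000))  # 36000000

# I/O Bus: 12 b per word at 500 kilo-operations/s, i.e., 6 megabits/s.
print(information_rate(12, 500_000))    # 6000000
```

The second result agrees with the last alternative form of the I/O Bus, which states the same link as a 6 megabits/s serial i-unit.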


6.  Memory 

6.1  Memory / M := simple-memory | compound-memory

6.2  simple-memory := component (
    cocomponents: read: component, write: component;
    functions: see Table 1;
    subcomponent: control;
    word / w: i-unit [i];
    size: 1 word [i];
    operations: (read | write | read, write);
    information-rate / i-rate: [i] / word X operation-rate [i/t];
    access-time / ta: constant | ~constant [t];
    cycle-time / tc: time(read; next write) [t];
    permanency: (decay | fast-read-slow-write / frsw | permanent /
        read-only / ro / ros / ROS / read-only-memory / rom / ROM |
        read-destruct | read-regenerate / rr | read-write / rw |
        write-only) [t];
    portability: (portable / p | not portable / fixed / f);
    technology: see Table 2)




Table 1  Memory functions

Function                          Description

Within C
  primary / p                     Primary memory; holds directly executable
                                  programs; instructions and data for
                                  instructions are taken from Mp, and it must
                                  be directly accessible by P
  secondary / s                   Secondary memory, in which data accessible
                                  to the ISP is stored; programs are not
                                  executed from secondary; normally Ms is
                                  much larger than Mp (and much slower); Ms
                                  holds files, programs (waiting to be
                                  executed), data, etc.

Within P, K
  address                         Holds operands
  buffer / synchronizer           Holds data while synchronizing with another
                                  component
  control                         Used during instruction's interpretation;
                                  state of a K
  data / operands                 Holds information that is operands or
                                  eventual operands
  fixed                           Used to define permanently the nature of a
                                  processor or a control
  error detection                 Holds detected error information, normally
                                  hardware errors
  error accounting                Holds counts of errors; normally part of
                                  Mps; two major types of errors, machine (or
                                  hardware) errors and process (or program)
                                  errors, are accounted
  instruction                     Holds parts of instruction as it is being
                                  interpreted
  processor state / ps            Includes all registers, state bits, and
                                  instruction counter associated with ISP;
                                  includes the following subcomponents:
    program state word            Holds the state of the program flow,
                                  overflow bits, i.e., the instruction or
                                  program counter, and any state bits
                                  accessible to a program
    process map                   Used to locate programs within Mp (and Ms)
    process registers             Specific arithmetic and indexing registers
                                  (e.g., AC, MQ, general registers, stack)
    program address /             Holds pointer to either the current or the
    instruction address /         next instruction the processor is to
    instruction location          interpret
    counter / program counter
  working / temporary             Holds intermediate results

Within T, L
  buffer / synchronizer           Used for synchronizing purposes
  control                         The K part of T or L
  working / temporary             Temporary results

Within D
  control                         K part of D
  data / operand                  D may stack operands and results,
                                  synchronizing with some other process
  instruction                     Current operation D is performing
  working / temporary             Temporary results of intermediate data

Within S
  address                         Position of switch, i.e., the information
                                  that holds gate-switches open or closed
  buffer / synchronizer           Any synchronizing storage needed within S
                                  for links
  control                         The K part of S


Table 2  Memory technology

Technology                                   Access†     Portability†  Permanency†

capacitor                                    r                         decay
core / magnetic core                         r                         rr
bulk core / large core storage / lcs         r                         rr
extended core storage / ecs                  r                         frsw
delay line / magnetostrictive delay line     c
mercury delay line                           c
optical delay line                           c                         rr
disk / diskpak                               l, c                      rw
fixed head disk                              l, c                      rw
moving head disk                             c
drum / fixed head drum                       c                         rw
moving head drum                             l, c                      rw
electrostatic storage tube                   r                         decay
integrated circuit array                     r, content  f             rw
logic technology (see PMS 4.6 for logic      r                         rw
  used to make active bit, register,
  and array memories)
magnetic card (e.g., Datacell)               l, c        p             rw
magnetic tape / tape                         l           p             rw
addressable magnetic tape                    b           p             rw
carousel magnetic tape                       c, l        p             rw
magnetic wire                                l           p             rw
photographic store (e.g., photostore)        l, r        p             w1ro
film (write once)                            l, r        p             w1ro
plasma display (readability: both)           r           f             rw
thin film                                    r           f             rw

Machine readable, read-only; nonportable; random access

capacitor array                              r           f             ro
diode array                                  r           f             ro
inductor array                               r           f             ro
rope / transformer coupled braided           r           f             ro
rope resistor                                r           f             ro



Memories which cannot be both read and written by a machine

Technology                             Writability  Readability  Access†  Permanency

badge                                  b            b                     ro
card / punched card                    m | b                              w1ro
credit card                            b            b
cathode ray tube / CRT                              h                     decay
storage CRT                            m            h                     wo
garment tag                                                               w1ro
keys / pushbuttons / keyboard          h            b                     rw
knobs                                  h            b                     rw
page / impact printed page / paper     m            b
braille page                           m            h                     w1ro
handprinted page                       h            b                     w1ro
handwritten page                       h            h                     w1ro
magnetic ink page                      m            b                     w1ro
thermal page                           m            b                     w1ro
typewritten page                       b            b                     w1ro
xerographed page                       m            b            l        w1ro
paper tape / punched paper tape        m | b        m | b                 w1ro
plot / incremental point plot          m            h                     w1ro
analog plot / continuous               m            h            r, l     w1ro
patchboard                             h            b                     rw
switches / toggle switches             h            b                     rw

†See PMS 6.2 for abbreviations; also c / cyclic, l / linear, r / random.

A simple-memory stores a single word of information by means of a
write operation and delivers that word on subsequent read operations.
There  is  no  addressing,  and  the  access  time  is  a  constant  (or  approximately 
so).  The  memory  is  connected  to  the  larger  system  via  one  component 
for  its  read  operation  and  one  for  its  write  operation.  These  are  usually 
links  and  need  not  be  distinct.  The  only  subcomponent  that  need  be  dis- 
tinguished in  a  simple-M  is  the  control  (though  of  course  the  word  may  be 
built  up  from  a  set  of  bit  memories).  The  information  rate  is  the  amount 
of  information  in  a  word  times  the  operation-rate.  The  cycle  time  is  the 
time  it  takes  to  read  the  memory  and  then  write  new  information  into  it; 
the  ISP  expression  (read;  next  write)  implies  a  sequential  operation.  The 
permanency  describes  what  happens  to  information  left  in  the  memory  as 
a  function  of  time.  This  concept  is  often  partially  covered  by  other  no- 
tions, such  as  reliability,  volatility,  destructive-nondestructive,  etc.  We  give 
the  main  values  that  arise  in  practice:  a  rate  of  decay  with  time  (which 
expands  to  an  actual  decay  function);  write-once-read-only  (e.g.,  cards  and 
photographs); read-write; fast-read-slow-write (a special case of read-write);
destruction  of  the  information  upon  reading;  and  permanent  or  read-only 
(as  long  as  the  system  remains  viable).  Write-only  refers  to  the  character- 
istic of  the  memory  from  the  point  of  view  of  the  system  under  discussion; 
always  there  is  some  other  system  (usually  a  human  being)  who  can  read 
the memory. Whether the memory can be only read or only written
(readability, writability) or both read and written, and by whom (human
or machine), is derived from the port characteristics. Portability denotes
whether information can be carried away from the system or is
nonportable (fixed). Two of the parameters, function and technology, are
extensive enough to be given in tables.

6.3  compound-memory := component (
    cocomponents: read: component, write: component, address:
        component;
    function: see Table 1;
    subcomponents: control; address: switch; memory: M-set, 'read-buffer:
        memory, 'write-buffer: memory;
    word: word(M.memory);
    size: sum(word(M.memory));
    operations: read-set, write-set;
    information-rate: [i] / word X operation-rate [i/t];
    access-time: access-time(S.address) [t]  random, cyclic, etc.; see PMS 7.3;
    cycle-time: cycle-time(simple-M);
    permanency: permanency(simple-M);
    portability: portability(simple-M);
    technology: see Table 2)

A compound-memory is a system of simple-memories, organized by an
addressing switch. Thus memory is fundamentally defined recursively as a
switch to other memories. At each switch stage the dimensionality of the
overall i-unit is reduced by one. The addressing may be provided by a
different cocomponent than those for the read and write data. All the
submemories have the same word, and the size of the compound-memory
is the sum of all these words. There may be additional subcomponent
memory within a memory, such as buffer memories and a memory
connected with the address switch and the control. However, none of these
is available for storage purposes, and they are not counted in the size. The
access time of the memory is defined by the access time of the address
switch. A classification of these can be found under the definition of switch
and is often used to classify memories (e.g., linear, random, cyclic, etc.).
Some parameters are the same as those given for a simple-memory, and
these are simply cross-referenced.

COMMENT  Not  all  conceivable  memories  come  under  the  definitions  just 
given  (e.g.,  we  have  assumed  constant  word  size);  but  in  fact  all  memories 
used  in  existing  digital  computers  do. 

EXAMPLES    Mp(core; t.access: 2 μs/w; 4096 w; 16 b/w)

            M(fixed head disk; t.access: 0 ~ 17 ms; i-rate: 300 kchar/s;
              size: 1 megaword)
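
The recursive view of a compound-memory, a switch to other memories whose size is the sum of the words reached through it, can be sketched as a small data structure. This is an illustration only; the class names SimpleMemory and CompoundMemory and the method size_bits are our assumptions, and the switch, control, and buffers are deliberately left out because they are not counted in the size.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class SimpleMemory:
    word_bits: int  # a simple-memory holds exactly one word

    def size_bits(self) -> int:
        return self.word_bits


@dataclass
class CompoundMemory:
    # Submemories reached through the addressing switch; each may itself
    # be compound, which gives the recursive definition.
    submemories: List[Union["SimpleMemory", "CompoundMemory"]]

    def size_bits(self) -> int:
        # size: sum(word(M.memory))
        return sum(m.size_bits() for m in self.submemories)


# Mp(core; 4096 w; 16 b/w) from the examples above.
mp = CompoundMemory([SimpleMemory(16) for _ in range(4096)])
print(mp.size_bits())  # 65536
```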




7.  Switch 

7.1  Switch / S := gate-switch | simple-switch | compound-switch

7.2  gate-switch := component (
    cocomponents: (input: component, output: component, initiators:
        component);
    subcomponents: ('control; 'input-buffer: M.i-unit; 'output-buffer:
        M.i-unit);
    operation: (open | close);
    concurrency: (1 | 2);
    concurrency-type: (simplex | half-duplex | full-duplex / duplex);
    i-rate: i-rate(link);
    delay: delay(link);
    hang-up-delay: [t];
    access-time / ta: constant [t])

A  gate-switch  acts  as  a  simple-link  or  as  no  connection.  It  is  used  to  trans- 
mit information  conditionally  between  the  ports  of  two  components.  It 
can be used as a basic primitive to express the structure of other switches,
including  the  simple-switch.  The  parameters  will  be  discussed  under  the 
simple-switch. 

7.3  simple-switch := component (
    cocomponents: (input / from: component-set, output / to: component-set,
        initiator: component-set);
    subcomponents: control, links: link-set, 'address: memory;
    operation: access;
    size: size(output(cocomponents));
    concurrency: +integer;
    concurrency-type: (simplex | half-duplex | full-duplex / duplex |
        dual-simplex | dual half-duplex | dual full-duplex / dual-duplex |
        time-multiplexed-cross-point / 1-trunk | cross-point |
        dual-cross-point | k-trunk);
    hierarchy: (hierarchical | nonhierarchical / anarchical);
    location: (central | distributed (cocomponent set));
    distribution: (radial | bussed / bus / chain / daisy chain);
    access-time / ta: switch-type(address / a, prior-address / p)
    switch-type := (
        bilinear: constant + constant X abs(a - p) |
        cyclic: constant + constant X (a - p) mod (size) |
        interleave: (a interleave-relation p → random)-list |
        linear: (a > p → constant + constant X (a - p);
            a < p → reset-time + constant X a) |
        first-in-first-out / fifo / queue: (constant | ~constant) |
        last-in-first-out / lifo / stack: (constant | ~constant) |
        dequeue: (constant | ~constant));
    permanency: (decay | transmit-destruct | time-multiplexed / tmx / tm |
        moving | cyclic | permanent | irreversible | fixed until broken /
        fixed | manual);
    hang-up-delay: [t];
    delay: delay(links);
    L-initiator: initiator(links);
    technology)

A  simple-switch  consists  of  a  set  of  potential  links  between  a  set  of 
input  and  output  components,  with  an  operation  (access)  that  can  actual- 
ize some  subset  of  the  links.  This  is  done  according  to  an  instruction  called 
the  address  (which  may  or  may  not  be  held  in  a  memory).  For  a  switch, 
the  cocomponent  input  and  output  ports  are  sometimes  listed  to  specify 
the  size  of  the  switch. 

An important parameter is the concurrency-type, which describes the
various subsets that can be simultaneously realized. The values given
correspond to practical alternatives: simplex, in which only a single simplex
link may be established at a time; duplex, in which a single full-duplex
link may be established; cross-point (also dual-cross-point), which permits
true simultaneity; time-multiplexed-cross-point, in which functional
simultaneity is established for many links by means of rapid switching within
the course of transmission of an i-unit (in essence the
time-multiplexed-cross-point has 1-trunk, which permits 1 conversation); and
finally k-trunks for k simultaneous conversations. We often use a duplex
switch instead of a simplex or half-duplex switch in PMS diagrams, even
though the latter would be more accurate.

Hierarchy  is  a  redundant  attribute  derived  from  the  cocomponent  set. 
As  a  rule,  if  there  are  n  identical  cocomponents  each  of  which  communi- 
cates with  one  another,  there  is  no  hierarchy.  A  telephone  system  is  a 
typical  nonhierarchical  structure.  Usually  the  switches  internal  to  a  com- 
puter are  hierarchical  in  that  there  are  n  components  of  type  a  which 
communicate with m components of type b. The a's communicate only
with the b's and vice versa; hierarchy does not determine the component
initiating the dialogue.

The  location  of  a  switch  refers  to  whether  the  hardware  is  localized 
within  one  of  the  components  using  the  switch,  whether  it  is  separate 
(called  central),  or  whether  it  is  distributed  through  all  the  cocomponents. 

An attribute that is not completely independent is distribution, which
denotes whether the physical structure is a continuous bus or chain or is
fed radially from a centralized component. See Fig. 13, Chap. 3, page 67
for common alternative physical structures.

A  major  way  of  classifying  simple-switches  is  by  their  access  time — 
cyclic,  linear,  random,  etc.  With  each  is  given  the  type  of  formula  that 
determines  the  actual  access  time.  The  two  critical  parameters  in  most 
switches  are  the  address  being  sought  (a)  and  the  prior  address  (p),  which 
represents  the  existing  state  of  the  switch.  Thus,  in  a  bilinear  switch  the 
access  time  consists  of  a  start-up  time  plus  a  time  proportional  to  the  mag- 
nitude of  the  difference  between  the  prior  address  and  the  desired  address. 
This  differs  from  a  linear  switch,  which  only  permits  movement  in  one 
direction  and  must  reset  to  an  initial  state  if  an  address  lower  than  the 
existing address (p) is sought. An interleave memory is one that consists of
a collection of random-access memories, with the access time depending on
the relationship between a and p (usually a modular one, such as
(a = p mod 4) → long access; (a ≠ p mod 4) → short access). Random access
means that the access time is independent of both a and p. This constancy
may be only approximate (as in using a drum with its cyclic character
ignored). Queues and stacks differ from the other switches in having a
degenerate addressing system such that the next link selected is determined
by the state of the switch itself. Dequeues allow either of the two ends of
a queue to be accessed.
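
The access-time formulas of PMS 7.3 can be written out directly. The sketch below is a hedged illustration, not a definitive model: the function and parameter names (start, per_step, reset_time) are our own labels for the unnamed constants in the definition, and the case a = p of the linear switch, which the definition leaves open, is assumed here to follow the forward branch.

```python
def bilinear(a, p, start, per_step):
    """Start-up time plus time proportional to the distance |a - p|."""
    return start + per_step * abs(a - p)


def cyclic(a, p, size, start, per_step):
    """Movement in one direction only, wrapping around modulo the size."""
    return start + per_step * ((a - p) % size)


def linear(a, p, start, per_step, reset_time):
    """Forward movement only; a lower address forces a reset to the start."""
    if a >= p:  # assumption: a == p is treated as the forward case
        return start + per_step * (a - p)
    return reset_time + per_step * a


def random_access(a, p, constant):
    """Access time independent of both a and p."""
    return constant


print(bilinear(10, 3, start=1.0, per_step=0.5))                # 4.5
print(cyclic(2, 6, size=8, start=0.0, per_step=1.0))           # 4.0
print(linear(2, 6, start=1.0, per_step=0.5, reset_time=3.0))   # 4.0
```

In the cyclic case the request for address 2 from prior address 6 on a switch of size 8 must pass 7, 0, and 1 first, which gives the four steps shown.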

Permanency refers to how long the switch maintains a link (or set of
them)  after  establishing  the  link  by  an  access  operation.  The  three  com- 
mon values  are  (1)  the  destruction  of  the  connection  with  the  transmission 
of  the  i-unit  across  the  link,  (2)  the  maintenance  of  the  connection  perma- 
nently, and  (3)  the  autonomous  movement  of  the  connection  (as  in  disks 
and  drums).  The  latter  two  give  rise  to  the  p  used  in  the  access  formulas. 
Rarer  is  a  decay  function,  in  which  the  link  remains  established  for  some 
period  of  time,  or  an  irreversible  connection,  which  can  be  set  just  once 
and  from  then  on  operates  like  a  simple-link. 

Hang-up  delay  is  the  time  taken  to  break  a  connection  after  the  appro- 
priate i-unit  has  been  transmitted.  Hang-up  delay  is  given  only  for  certain 
permanencies  of  fixed-until-broken  and  manual  switches. 

A  number  of  parameters  derive  directly  from  the  properties  of  the  set 
of  ports  or  links — the  size  of  the  i-unit,  the  information-rate,  the  link  de- 
lay, the  direction  of  data  flow,  and  the  component  that  can  initiate  data 
transmission  (as  opposed  to  initiating  accessing).  Finally,  there  is  tech- 
nology, which  is  not  given  in  detail,  since  much  of  it  is  identical  to 
memory  technology. 

EXAMPLES    S('I/O BUS; location: K; from: P; to: K; half-duplex; initiators:
              P, K; switch-type: random; ta: 5 μs; concurrency: 1)

            S(cross-point; 16 M; 6 (P + K); concurrency: 6; location: M)

7.4  compound-switch := simple-switch (
    subcomponents: control, links: link-set, subswitches: switch-set,
        'address: memory;
    access-time: (cascade: sum(access-time(subswitches)) |
        parallel: max(access-time(subswitches))))


A compound-switch is an array of switches whose links are connected
so that the outputs of some are inputs to others, thus effecting a total
set of links, which go from output to input component-sets. It can be
defined  as  an  extension  of  a  simple-switch,  since  most  parameters  are 
defined  identically  for  both.  Many  combinations  of  accessing  arrangements 
are  possible.  The  two  most  common  are  given  above.  A  cascade-switch  is 
one  in  which  each  accessing  of  the  next  subswitch  must  take  place  after 
the  prior  one  so  that  the  access  times  add.  A  parallel-switch  makes  all  the 
accesses  simultaneously,  so  that  the  total  access  time  is  simply  the  access 
time  of  the  subswitch  that  takes  longest.  (In  both  cases,  there  can  be  ad- 
ditional overhead  time,  but  this  can  usually  be  allotted  to  the  subswitches 
and  does  not  require  separate  terms  in  the  expressions  for  access  time.) 
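The two access-time rules of definition 7.4 can be sketched in a few lines of Python. This is only an illustration: the `Switch` class, its field names, and the microsecond figures are assumptions made for the example and are not part of the PMS notation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Switch:
    """Hypothetical model of a switch; access times are in microseconds."""
    name: str
    access_time_us: float = 0.0
    arrangement: str = "simple"  # "simple", "cascade", or "parallel"
    subswitches: List["Switch"] = field(default_factory=list)

    def total_access_time(self) -> float:
        """Apply the cascade (sum) or parallel (max) rule recursively."""
        if self.arrangement == "simple":
            return self.access_time_us
        times = [s.total_access_time() for s in self.subswitches]
        if self.arrangement == "cascade":
            return sum(times)   # each access waits for the prior one
        if self.arrangement == "parallel":
            return max(times)   # all accesses proceed simultaneously
        raise ValueError(f"unknown arrangement: {self.arrangement}")


# Illustrative access times, not taken from any real machine.
stages = [Switch("s1", 5.0), Switch("s2", 2.0), Switch("s3", 1.0)]
cascade = Switch("cascade", arrangement="cascade", subswitches=stages)
parallel = Switch("parallel", arrangement="parallel", subswitches=stages)

print(cascade.total_access_time())   # 8.0
print(parallel.total_access_time())  # 5.0
```

Any overhead time would, as the text suggests, be folded into the access times of the individual subswitches rather than added as a separate term.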

8.  Control 

8.1  Control / K := simple-control | compound-control

8.2  simple-control  :  =  component  ( 

cocomponents:  controlled  /  object:  component-set,  'instruction: 
component-set,  'data:  component-set; 

subcomponents:  'instruction:  memory,  working  /  w:  memory, 
operations:  data-operation; 

operations: evoke / →, next-evoke / next, condition-operations;

controlled-operations:  (controlled-component:  operation)-list; 

instruction-source:  (none  |  data  |  instruction); 

instruction-set) 

A simple-control is a logical circuit (usually sequential) that evokes
operations in other components (the controlled, or object, components).
Thus, its main operations are those of evoking and evoking-next
(symbolized as → and next in ISP). However, it must also detect conditions
on which such evoking depends, so it has additional operations available,
which are combined in an instruction-set (see ISP 2.1). These vary greatly
in complexity, from boolean operations to arithmetic operations (such as
counting the number of i-units processed).

A major distinction is the source of the external instructions that can
be given to the control. At one extreme there may be none, as in a clock
whose function is to interrupt the system every millisecond. The common
case is that in which all the external instruction comes via the data itself.
More complex controls have a separate set of external instructions (often
called control characters or commands). A control does not obtain its own
next instruction, being dependent on an external component to set it into
action. This is the primary characteristic that distinguishes it from a
processor. It does have an instruction-set, which is the ISP expression that
shows what conditions evoke what actions.

No technology is given, since controls are all realized in a logic
technology, as given in the definition of component. Likewise, no function
parameter is given, since there exists no special vocabulary to designate
the different subspecies of control tasks.




EXAMPLES    K(Mp;  input:  Pc;  output:  Mp) 
K(D(multiply)) 

8.3    compound-control :  =  simple-control  ( 

subcomponents:  alternatives:  simple-K-set,  'instruction:  memory, 
working:  memory; 

instruction-source:  mode-instructions) 

A compound-control consists of a collection of alternative simple-controls
and can be given as an extension of the simple-control. At any time, the
control is one of these simple-controls. Determination of what
simple-control is operative (often called the mode the control is in) is by a
mode-instruction from some external component. This additional freedom
requires a subcomponent, the control-state, to hold the current specification.
(Thus it is possible, though rare, that the actual simple-K is determined
by a sequence of mode-instructions, each determining some part of the
control state.)

EXAMPLE  K(Instruction set processor / ISP; input: M.processor_state;
output: D, K(Mp), K(L('I/O Bus)); M(read-write; 40 b; working);
M(read only; 100 w; 36 b/w; 1 μs/w))

9.  Transducer 

9.1  Transducer  /  T  :  =  simple-transducer  |  compound-transducer 

9.2  simple-transducer  :  =  component  ( 

cocomponents: input: component, output: component, initiator:
(input | output | both);

subcomponents: input: L, output: L, 'control;

functional-name: (input: reader / sensor / pen / receiver; output:
writer / punch / perforator / display / printer / transmitter;
synchronizer; isolator; transducer);

operation: transduce (plus transmit) / ←;

carrier       see port of component;

transduction: port(output) ← port(input);

divergence: i-unit(output) − i-unit(input) [i];

divergence-rate: divergence × o-rate [i/t];

portability: (portable | not portable / fixed);

concurrency-type: simplex;

concurrency: 1;

transduction-technology := (amplification | analog-digital |
angular-linear | attenuation | electroluminescence | electromagnetic |
electromechanical | electromechanical-acoustic | electro-optical |
mechanical-indentation | photochemical | xerographic)


transducer-technology := (analog-digital converter | bell | buzzer | TV
camera / vidicon | card reader | card punch | CRT display | storage
CRT display | plasma display | 3 D display | printed document
reader / document reader | document printer | magnetic character
document reader | film reader | film writer | gong | joystick | keys |
keyboard | light gun | light pen | continuous line plotter | line printer /
printer | linear actuator | SRI mouse | paper tape reader | paper tape
punch | incremental point plotter | pressure transducer | speech
synthesizer | Rand tablet | Sylvania tablet | telephone dial | push
button telephone dial | thermocouple | Lincoln Laboratory Wand))

A simple-transducer is a pair of connected links that have different i-units
and/or underlying carriers. As defined above, transduction is a digital
operation, taking in an i-unit of the input link and producing an i-unit of the
output link. Meaning is preserved; that is, only the encoding has changed.
Preservation of meaning distinguishes transduction from data operation.
The amount of information need not be preserved, so that information
divergence is an additional characteristic of a transducer. It may be
positive or negative, as the net number of bits is either increased or decreased.
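The divergence and divergence-rate attributes amount to simple arithmetic, sketched below in Python. The function names and the 6-bit to 8-bit character transducer are hypothetical, chosen only to show a positive and a negative divergence.

```python
def divergence(input_bits: int, output_bits: int) -> int:
    """Information divergence in bits: i-unit(output) - i-unit(input)."""
    return output_bits - input_bits


def divergence_rate(input_bits: int, output_bits: int, o_rate: float) -> float:
    """Divergence per unit time: divergence x o-rate, here in bits per second."""
    return divergence(input_bits, output_bits) * o_rate


# Hypothetical transducer: takes in 6-bit characters, produces 8-bit
# characters, and performs 300 transductions per second.
print(divergence(6, 8))              # 2
print(divergence_rate(6, 8, 300.0))  # 600.0

# In the reverse direction bits are dropped, so the divergence is negative.
print(divergence(8, 6))              # -2
```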

A  simple-transducer  is  called  a  simplex,  in  that  information  flow  is  in 
one  fixed  direction  only  (as  in  a  simple-link). 

Knowing  the  function  of  the  transducer  permits  an  inference  of  whether 
one  interface  of  the  transducer  involves  a  human  being.  This  inference 
can  be  derived  from  the  port  characteristics. 

EXAMPLES    T(line printer; 1000 lines/min; 132 char/line; 8 bit/char)
T(paper tape; reader; 300 char/s; 8 b/char; width: 1 in.)
T(sense amplifier; i-rate: .5 w/s; 24 b/w; input: M(memory
stack))

9.3    compound-transducer := (
simple-transducer-set;

concurrency-type: (half-duplex | full duplex);
compound-transducer-technology;

concurrency: + integer)

compound-transducer-technology := card reader-punch | computer
console / processor console / console | Dataphone | keyboard-CRT
display | diskpak drive | film write-reader | magnetic card transport |
magnetic tape transport | typewriter | Teletype | special purpose
console := (airlines reservations | stock quotation | data collection)

A compound-transducer consists of a set of simple-transducers. The two
simplest kinds are the half-duplex and the full-duplex, which are extensions
of the simple-transducer, wherein the direction of information flow can be
either way but only one way at a time (half-duplex) or can be both ways
simultaneously (full-duplex). The more general case is simply a set of
transducers with independent inputs and outputs (so that overall there is no
switching function). It is common to call this a multiplexed transducer, in
which concurrency is specified by an integer.




EXAMPLES    T.half-duplex(typewriter; 15 char/sec; output: paper, video,
audio; input: keyboard; 88 char; 8 b/char)
T.multiplex(console; keyboard, display, printer)


10.  Data-operations 

10.1  Data operations / D := simple-data-operation |
compound-data-operation

10.2  simple-data-operation := component (

cocomponents: inputs: 'components, output: component, initiator:
input;

subcomponents: working: M-set, control: K-set;
operations:       see ISP data-operations, ISP 3.1;
operation time: [t];
concurrency-type: simplex;

data-types: data-type(operations)       see ISP data-types, ISP 1.3)

A data-operation creates information (i.e., new instances of data-types)
that has new meaning. It usually does this as a function of input
information (e.g., a floating point multiply which creates a floating point
number that represents the product of the two input numbers). It may or may
not destroy some existing information (e.g., a tally operation, which modifies
the existing number in creating the new one). A data operation differs
from a transducer (T), since its output differs in meaning from its input.
The T preserves meaning, while changing representation.

The data-operation takes the data-type i-units at the input ports,
operates on the data, and presents the result at the output port. The
simple-data-operation can perform only one operation at a time. The simplest D
is just a set of transfer paths between registers for performing some
operation on a boolean vector (that is, A ∧ B, A ⊕ B, ¬A) or a combinational
network (that is, X = 0). Slightly more complex D's are the additive
operations on integers (+, −). Operations like ×, / are usually constructed
from more primitive D's, +, −, and (/2), with a subcontrol (K) to step
through the various substeps of the arithmetic algorithm. Finally, a
floating point multiply would be formed as a sequence of
simple-data-operations controlled by one or more common subcontrols.
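As an illustration of how a multiply can be built from the more primitive D's, here is a minimal Python sketch of the familiar shift-and-add algorithm for unsigned integers. It is one possible arrangement and not a description of any particular machine: the helper names `d_add` and `d_halve`, the 12-bit default width, and the restriction to non-negative operands are all assumptions. The loop plays the part of the subcontrol K that steps through the substeps.

```python
def d_add(a: int, b: int) -> int:
    """Primitive D: addition."""
    return a + b


def d_halve(a: int) -> int:
    """Primitive D: division by 2 (a one-place right shift)."""
    return a >> 1


def multiply(multiplicand: int, multiplier: int, width: int = 12) -> int:
    """Unsigned shift-and-add multiply built only from d_add and d_halve."""
    if multiplicand < 0 or not 0 <= multiplier < (1 << width):
        raise ValueError("operands must be unsigned and fit the given width")
    product = 0
    for _ in range(width):           # the subcontrol counts the substeps
        if multiplier & 1:           # condition tested by the subcontrol
            product = d_add(product, multiplicand)
        multiplicand = d_add(multiplicand, multiplicand)  # doubling via +
        multiplier = d_halve(multiplier)
    return product


print(multiply(6, 7))  # 42
```

A real D would also fix the width of the product register and handle signs; both are left out here to keep the substep structure visible.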

EXAMPLES    D(operation: +; data-type: fixed; i-unit: 32 b;
operation-time: .2 μs)

D(floating point multiplier; data-type: f; i-unit: 36 b;
operation-time: 2.0 μs; M.working (3 × 36 + 10) b)

10.3  compound-data-operation := simple-data-operation(
subcomponents: alternatives: simple-data-operation-set;
instruction: memory;
concurrency: + integer;

instruction-source: data, instructions, operator instruction)

A  compound-data-operation  consists  of  a  collection  of  alternative  simple- 
data-operations.  Thus,  a  compound-data-operation  is  compound  either  in 
time,  by  having  many  varied  operations  which  can  be  selected  sequen- 
tially, or  in  space,  by  having  many  separate  operations  which  can  perform 
in  parallel. 


EXAMPLE    D(arithmetic unit; data-type: (integer, floating, boolean vector);
operations: +, −, ×, /, ∧, ∨, ⊕, ¬, normalize; operation-time:
1 ~ 2.0 μs; input: 2 × 36 b; output: 36 b; M.working:
4 × 36 b)


11.  Processor 

11.1  Processor / P := simple-processor | complex-processor

11.2  simple-processor := component (

cocomponents: primary: M-set, 'secondary: M-set, controlled:
component-set;

function: (microprogram | central / general purpose / c | input-output /
io | display | array | vector move | special algorithm | language);

subcomponents: (interpreter: K; data-operations: D-set;
M.processor-state / ps: see PMS Table 1; M.non-processor-state: see PMS
Table 1);

operations: operations(data-operations), operations(cocomponents)
see ISP;

data-types: data-type(operations)       see ISP;
cycle-time / tc: cycle-time(Mp);
i-rate: i-rate(Mp);

concurrency: (o-rate / cycle-time) [o];
program-switching-time: [t];
interrupt-response-time: [t];
instruction-set       see ISP 2.1;

instruction-efficiency: (operations / instruction) / instruction-size [o/i];

algorithm-encoding-efficiency: (sum(data i-units)/[t]) /
(sum(data i-units + instructions)/[t]);

instruction-size: [i];
operation-code-size: [i];
address-size: [i];




addresses-per-instruction: (0 address / stack | 1 address / 1 | 1 + index /
(1 + x) | 1 + general register address / (1 + g) | 2 address | 3 address |
n + 1 address | compound))

A simple processor is always associated with a memory (its primary
memory), which holds the program (and usually the data) for the processor.
In addition, there may be secondary memories and also other components
that are controlled by the processor.

The processor often functions as the main component of an essentially
isolated system (often called stand-alone); it is then a central processor, Pc.
Processors also occur as more specialized components in larger systems, e.g.,
to manage input/output (Pio) or display (P.display) or to do a subset of
data-operations efficiently (P.data, P.vector_move, P.array, or
P.special_algorithm). Processors are sometimes built in hierarchy, using one
processor to perform the interpretation and operations of another. Such
processors have become known as microprogram processors.

The  distinguishing  feature  of  a  processor  is  that  it  determines  its  own 
next  instruction.  The  control  that  does  this  is  called  the  interpreter.  The 
repertoire  of  operations  of  the  processor  is  partly  a  set  of  data-operations 
performed  by  its  own  subcomponents  and  partly  the  set  of  operations 
proper  to  a  set  of  transducers,  memories,  links,  and  switches  external  to 
the  processor  but  incorporated  into  its  operation  code.  The  operations  are 
largely  determined  by  the  set  of  data-types  (see  the  ISP  section). 

A processor may have considerable internal memory (called the
processor state, Mps). Besides the instruction and instruction-address registers,
which are necessary for interpretation, there may be various amounts of
status information, accumulators, index registers, general registers, and
accumulator stacks. No one system has all of these memories, since they
often provide alternatives to each other (e.g., index registers and general
registers).

Each of the operations has its own operation time and its own
possibilities for being overlapped with other operations. Several parameters are
given that summarize this array of information: the cycle-time of Mp,
which in the long run limits the rate at which instructions and data can
be accessed (and also determines the maximum throughput); the
concurrency, which tells how many operations can be performed per cycle time
(this requires an averaging of the various possibilities as given in the
instruction set); and the program-switching time, which is the time required
to change context from one program to another. In simple operating
regimes (standard batch processing) program-switching time is not an
important parameter; it becomes so when interrupts are permitted. For
interrupts, the response time is critical. It is the time between when a request
is made and when the request is acknowledged by P. The instruction set
is really an entry point to the ISP description of the processor. One might
give here simply the number of instructions, but this can be a very
misleading number, since many variations of a basic instruction can be counted,
thus giving highly erroneous results. The algorithm-encoding-efficiency is
the ratio of i-units used for data per unit time to the number of accesses
for data + instructions per unit time. This efficiency is strongly affected
by the address size, which is usually the address size of the Mp but need
not be if a processor uses an incremental or relative addressing system.
The ratio can be measured at many levels of the ISP: instruction-by-
instruction, on a subroutine, or for a whole program. In a simple computer,
this ratio is near 1/2. Vector operations can allow a ratio much closer to 1.
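The algorithm-encoding-efficiency ratio can be worked through numerically. The Python sketch below assumes that data and instruction accesses are counted over the same interval, so the time unit cancels; the function name and the access counts are illustrative and are not measurements of any machine.

```python
def algorithm_encoding_efficiency(data_accesses: int,
                                  instruction_accesses: int) -> float:
    """Ratio of data i-units to (data + instruction) i-units accessed."""
    total = data_accesses + instruction_accesses
    if total == 0:
        raise ValueError("no accesses were counted")
    return data_accesses / total


# Assumed 1-address program: every instruction fetch is paired with exactly
# one data access, so half of all accesses carry data.
print(algorithm_encoding_efficiency(100, 100))  # 0.5

# Assumed vector operation: one instruction fetch governs 1000 data accesses.
print(round(algorithm_encoding_efficiency(1000, 1), 3))  # 0.999
```

The same function can be applied instruction by instruction, to a subroutine, or to a whole program, simply by changing which accesses are counted.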

Common measures for the instructions give the size of the operation
code, the address, and the instruction. The addresses per instruction is one
of the best parameters to indicate the overall structure of the instruction
set and is called the instruction-type. It ranges from 0 addresses (systems
which execute a sequence of operations) through 1, 2, and 3 addresses per
instruction to a variable number of addresses. Between 1 and 2 addresses lie
index register (1 + x) and general register (1 + g) machines. In a special
class is the (n + 1) organization, which involves an additional address to
obtain the next instruction; it can be added to any other organization.

EXAMPLES    Pc('DEC PDP-8; 1 address/instruction; ^2 w/instruction;
12 b/w; 1.5, 3.0, 4.5 μs/instruction)
Pio('IBM 7909; 500 kw/s; data-types: words, integer;
1 address/instruction; 36 b/w)

11.3    complex-processor := simple-processor (

Mp-concurrency: (1 P | 1 P with interrupt | 1 program with multiple
concurrent subprograms | 1 Pc - n Pio | monitor + 1 user program |
monitor + 1 swapped program | fixed multiprogramming |
multiprogramming | segmented-programming);

multiprogramming := (no relocation | protect only | 1 segment |
2 segment / pure | impure segments | > 1 segments | paging)
segmented-programming := (fixed length page segments |
multiple length page segments | variable length page segments |
named segments);

P-concurrency: (serial / serial by bit | parallel / parallel by word |
multiple instruction streams | multiple data streams (arrays) |
pipeline processing | instruction-memory);

instruction-memory := (none | 1 instruction look ahead | n instruction
look ahead | cache / look aside / slave memory))

A complex processor is often an extension of a simple processor along the
dimension of memory mapping, since a processor is already a highly
structured and "complex" component.

Note that a collection of processors does not constitute a compound
processor in a way similar to other PMS components; hence, we denote a
general collection of processors as a computer. Thus, a complex processor
can be written in terms of a simple-P with new values. The central
processor using a microprogrammed processor contains a specialized processor
as a subcomponent (P.microprogram).

Three attributes separate a simple processor from a complex processor:
Mp-concurrency, P-concurrency, and instruction-memory. In essence, the
simple processor has no Mp concurrency (interpreting a single program)
and serial or parallel P concurrency, with no instruction-memory
(buffering for multiple instructions). These attributes are independent of one
another and are discussed in Chap. 3.

12.  Computer 

12.1  Computer / C := simple-computer | compound-computer | network

12.2  simple-computer := component (
structure: 1 Pc | 1 Pc.interrupt;

subcomponents: Pc, Mp-set, 'controlled: component-set(Pc);
cocomponents: none;

function: (scientific | business data processing | general purpose | process
control / control | communication := (switching | store and forward) |
terminal control / input-output / io | display | file processing / file
control | time-sharing);

access-time: access-time(Mp);

cycle-time: min(cycle-time(Mp));

access-type: access-type(Mp.min);

instruction-type: instruction-type(Pc))

A simple computer consists of a single Pc (possibly with interrupt
capability) with an Mp (possibly a set of them) plus some set of transducers,
Ms's, switches, and controls. It is a complete system that can stand alone
and accomplish processing for a wide variety of functions.

Almost all of its significant parameters are derived from those of the
Pc or the Mp (using the Mp with the minimum cycle time if there are
several Mp's).

EXAMPLES    C('Whirlwind I; Mp(core; 8 μs/w; 2048 w; 16 b/w);
Pc(M.processor_state: ^2 w; 1 instruction/w; 1 address/
instruction); 1948 ~ 1966)
C('LGP-30; technology: vacuum tubes; power: 1500 watts;
Mp(drum; 4096 w; 31 b/w; t.access: .260 ~ 16.6 ms);
Pc(1 address/instruction; 1 instruction/word; Mps: ^2 w))

12.3    compound-computer := simple-computer(

structure: ((1 Pc, n Pio) | (1 Pc, n Pio, P.display) | (2 Pc) |
(n Pc multiprocessor) | (n Pc, P(array)) | (n Pc, special algorithm) |
(n Pc parallel processor));

subcomponents: Pc-set, Mp-set, 'controlled: component-set(Pc-set))

The essential feature of compound computers is to have more than one
processor. This is indicated primarily by the structure parameter but
requires augmenting the subcomponents to include a set of Pc's. Other than
this, compound-C's are the same as simple-C's, although some parameters
(such as instruction-type) may not have simple values if several Pc's differ
radically.

The simpler compound-C's retain a single Pc, but add input/output
processors (Pio's and then P.display's). The next step is to limited
multiprocessing, with 2 Pc's, and on to n Pc's operating on many programs, and
finally to parallel processing operation on many tasks of a single program.
A parallel processor is distinguished from a network; namely, there is no
way to decompose a parallel processor into disjoint C's (with Pc's and
Mp's). In both multiprocessing and parallel processing there may or may
not be Pio's, P.display's, and other special-function processors.

EXAMPLES    C(1 Pc-8 Pio; 'IBM 7094 II; Mp(32768 w; 1.4 μs/w; 36 b/w);
Pc(1 address; 1 instruction/w; M.processor_state: 12 w;
data-types: (integer, word, bv, sf, suf, df, duf, fr.i)); 1962 ~ 1966)
C(multiprocessor; 'Burroughs D-825; Mp(65 kw; 4.8 μs/w; 48
b/w); 16 (Pc, Kio); Pc(stack; 12 b/syllable; 1 ~ 7 syllable/
instruction; data-types: integer, floating, single character,
boolean vector))

12.4    network/N  :  =  dual-C  |  network-C  |  C-set. 

A network is any collection of two (dual-C) or more computers not
interconnected through primary memory. The network-C is a special case of a
single physical structure which is usually called a single C but by its
structure is a network (for example, CDC 6600). Finally, a set of
interconnected computers that are physically separate is the most general
case of a network.

ISP  conventions 

Making use of the prior general conventions and the PMS definitions, ISP
is developed systematically. We do this only for the processor and not for
controls (though the system might be adapted to that end). Several
notations are added to make ISP conform with currently existing notations.

The top-level entities of ISP (data-types, operations, the interpreter,
and the instruction-set) are values of corresponding attributes in the PMS
definition of a processor. An image of all the PMS structure for a computer
system exists in the instruction set of the processors that control the PMS
components. PMS notation is assumed for this. In ISP the primary
memory (Mp) is usually named M; all other memories must be specifically
declared and named.

1  Data-types 

2  Instructions 

3  Operations 

4  Processors 




1.  Data-types 

1.1  We  give  first  a  general  definition  of  data-types  (1.2),  and  then  two 
shorter  notations,  which  are  the  ones  commonly  used — i-units  (1.3)  and 
data-type-names  (1.4). 


1.2    data-type  :=  ( 
referent:  entity; 
referent-expression; 
*component-list; 
component:  data-type; 
carrier:  i-unit; 

format: (component: memory-expression)-list;
information-content: [i])

A data-type specifies the encoding of a meaning into an information
medium. The meaning of the data-type (that which it designates or refers to)
is called its referent (or value). The referent may be an entity, ranging
from highly abstract (the uninterpreted bit) to highly concrete (the
payroll account for a specific type of employee). The encoding of this
referent either is directly understood (as when a bit encodes a bit) or must be
given by the referent expression in terms of the component data-types.

EXAMPLE    binary-floating-point-number := data-type(
referent: number;

component-list: mantissa, exponent;
referent-expression: mantissa × 2 ↑ exponent)

COMMENT  Note that in the referent expression the component data-types
are taken to designate their values, i.e., the mantissa is a signed fraction
and the exponent is an integer. This avoids a clumsier notation in which one
could write:

referent(mantissa) × 2 ↑ referent(exponent).

Associated with every data-type is an i-unit, called its carrier, into
which all its component data-types can be mapped. The carrier is used in
storing the data-type in memories and in transmitting it over links. It must
be extensive enough to hold all the component data-types, but it may be
larger (having error-checking and -correcting bits, or even unused bits).
It need not hold disjointly all the carriers of the component data-types,
since packing may occur. However, the component data-types must all
have their relative structures preserved (or they cannot be processed). The
mapping of the component data-types into the carrier is called the format.
It is given as a list that associates to each component a memory expression
involving the carrier (see ISP 2 for definition of memory-expression).


EXAMPLE  floating-point-number := data-type (
component-list: mantissa, exponent;
mantissa := 23 b; exponent := 9 b;
carrier: word, 32 b/w;

format: (mantissa: word<0:22>, exponent: word<23:31>))
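To make the idea of a format concrete, the following Python sketch pulls the two component fields out of a 32-bit carrier word. It assumes that bit 0 is the leftmost (most significant) bit of the word, which the example itself does not state, and it returns the raw bit patterns without interpreting sign or exponent bias. The names `field` and `unpack` are hypothetical.

```python
WORD_BITS = 32  # carrier: word, 32 b/w


def field(word: int, first: int, last: int, word_bits: int = WORD_BITS) -> int:
    """Return bits word<first:last>, counting bit 0 as the leftmost bit."""
    width = last - first + 1
    shift = word_bits - 1 - last   # distance of the field from the right end
    return (word >> shift) & ((1 << width) - 1)


def unpack(word: int) -> dict:
    """Apply the format list: each component maps to a memory expression."""
    return {
        "mantissa": field(word, 0, 22),   # 23 b
        "exponent": field(word, 23, 31),  # 9 b
    }


# Build a word from two known bit patterns and recover them.
word = (0x123456 << 9) | 0x1A5
parts = unpack(word)
print(hex(parts["mantissa"]), hex(parts["exponent"]))  # 0x123456 0x1a5
```

Under the opposite numbering convention (bit 0 rightmost) only the `shift` computation would change.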

The  five  parameters — referent,  referent-expression,  component-list, 
carrier,  and  format — determine  a  data-type.  The  information  content  is 
simply  a  useful  redundant  parameter,  which  gives  the  amount  of  variety 
of  the  data-type.  An  upper  bound,  of  course,  is  the  amount  of  information 
in  the  carrier.  A  better  estimate  is  the  sum  of  the  contents  of  the  compo- 
nent data-types.  A  true  value  must  take  into  account  the  dependencies 
between  components.  The  efficiency  of  encoding  (under  the  constraint  that 
the  encoding  must  be  into  the  carrier  and  that  all  possible  values  must  be 
represented,  no  matter  how  low  their  probability  of  occurrence)  is  the 
ratio  of  the  information  content  to  the  carrier  content. 
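The efficiency of encoding can be illustrated with a small calculation. The Python sketch below assumes that all values of the data-type are equally likely, so that the information content is the base-2 logarithm of the number of distinct values; the decimal digit carried in 4 bits is a standard example and does not come from the text.

```python
import math


def information_content(distinct_values: int) -> float:
    """Information content in bits, assuming equally likely values."""
    if distinct_values < 1:
        raise ValueError("a data-type needs at least one value")
    return math.log2(distinct_values)


def encoding_efficiency(distinct_values: int, carrier_bits: int) -> float:
    """Ratio of information content to carrier content."""
    if carrier_bits <= 0 or distinct_values > 2 ** carrier_bits:
        raise ValueError("carrier cannot represent all values")
    return information_content(distinct_values) / carrier_bits


# A decimal digit (10 values) held in a 4-bit carrier.
print(round(encoding_efficiency(10, 4), 3))  # 0.83

# A 4-bit binary integer uses its carrier fully.
print(encoding_efficiency(16, 4))  # 1.0
```

Dependencies between components, which the text says a true value must account for, would lower the information content below this estimate.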

1.3    data-type := i-unit

The simplest data-types are i-units. An i-unit as a data-type implicitly
determines the five defining parameters given in ISP 1.2. The referent is
the uninterpreted i-unit itself (i.e., a word is to be handled only as an
uninterpreted unit of information). There is no need for a referent
expression. The carrier is the i-unit itself, if it is an i-unit capable of
independent storage and transmission in the system. If not, then the carrier is
the smallest such i-unit that contains the given i-unit. The component
data-types are the first sublevel of structures of the i-unit. There are no
components if the i-unit is a base-unit (bit or undecomposable character). If
the i-unit is the carrier, no format is needed. If a larger carrier is required,
then a mapping is usually implicit (e.g., 1 bit in a word goes into the
low-order position; 1 word in a block goes into the first word, etc.). If not, a
format must then be given in the regular way.

1.4    data-type  :=  data-type-name 

data-type-name := i-unit-name | simple-name |

component-name . length-type | precision . data-type-name |
component . component . . .

length-type := array / a | string / st | vector / v

precision := + integer | multiple / m | quadruple / q | triple / t |
double / d | "single / s | half / h | fractional / fr

A naming scheme is provided for data-types, which can be used as a basis
for abbreviations. Some data-types have arbitrary simple names (e.g.,
character, floating point numbers); others are named by their value (e.g.,
integer). Data-types that are iterations of a basic component can be named
by the component suffixed by a length-type. The length-type can be
array/a, implying a multidimensional array of fixed but unspecified dimensions;
a string/st, implying a single sequence of variable length (on each
occurrence); or a vector/v, implying a one-dimensional array of a fixed but
unspecified number of components. The length-type need not exist, and then
this form of the name is not applicable.

Data-types  are  often  of  a  given  precision,  especially  when  referring  to 
numbers;  it  has  become  customary  to  measure  this  in  terms  of  the  number 
of  components  that  are  used,  e.g.,  triple-precision  integers.  Names  can  be 
formed  from  the  basic  data-type-name  by  prefixing  the  precision.  Note 
that  a  double-precision  integer,  while  taking  two  words,  is  not  the  same 
thing  as  a  two-integer  vector;  so  that  the  precision  and  the  length-type, 
although  both  implying  something  about  the  size  of  the  carrier,  do  not 
express  the  same  thing.  Finally,  it  is  possible  to  name  a  data-type  by  simply 
listing  its  components. 

The main use of the data-type-name is to permit the short abbreviations
which arise by replacing every part with its abbreviation and dropping
the periods. Thus, double-precision integers have the data-type-name
of double.integer, which can be replaced by d.i and then by di. Similarly,
a vector of bits is bit.vector / b.v / bv. [The definition of data-type-name
is consistent in its use of period with the definition of compound name
(see GC 10).]
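The abbreviation rule just described (replace every part by its abbreviation, then drop the periods) can be sketched in a few lines of Python. This is only an illustration: the function name abbreviate and the small ABBREVIATIONS dictionary are assumptions made for the example, with entries taken from the precision and length-type lists of 1.4 and from Table 3.

```python
# Hypothetical helper illustrating the naming rule; not part of ISP itself.
ABBREVIATIONS = {
    "double": "d", "single": "s", "half": "h",    # precisions (1.4)
    "array": "a", "string": "st", "vector": "v",  # length-types (1.4)
    "integer": "i", "bit": "b", "word": "w",      # basic names (Table 3)
}


def abbreviate(data_type_name, abbreviations=ABBREVIATIONS):
    """Return (dotted form, short form) for a period-separated name."""
    parts = data_type_name.split(".")
    # Step 1: replace every part with its abbreviation (unknown parts stay).
    dotted = ".".join(abbreviations.get(part, part) for part in parts)
    # Step 2: drop the periods.
    return dotted, dotted.replace(".", "")


print(abbreviate("double.integer"))  # ('d.i', 'di')
print(abbreviate("bit.vector"))      # ('b.v', 'bv')
```

Parts with no entry in the dictionary are passed through unchanged, which mirrors the fact that some data-types have only an arbitrary simple name.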

If  a  data-type  is  defined  by  giving  just  its  name,  conventions  are  re- 
quired to  define  the  five  parameters  of  the  data-type.  The  carrier  is  always 
taken  to  be  the  smallest  i-unit  that  can  contain  the  data-type  with  the  fol- 
lowing mapping.  The  format  is  taken  to  imply  that  the  components  are 
laid  out  in  order  (with  no  packing)  into  the  subcomponents  of  the  carrier 
i-unit.  The  referent  of  the  data-type  is  given  by  context,  e.g.,  if  the  data- 
type is  simply  an  iteration  of  some  kind  of  a  data-type  whose  value  is  al- 
ready understood,  (e.g.,  in  a  vector  of  integers).  Thus,  there  is  no  need  for 
a  referent  expression. 

1.5 We give below a number of basic data-types that need to be defined
explicitly. Table 3 summarizes a large number of data-types and gives
their standard abbreviations, as above. Figure 3 of Chap. 2 shows the
lattice of data-types in which one data-type is connected to a higher one
if it can be obtained by a further specification of the higher one. This is
significant, since operations on higher data-types also apply to the lower
ones. In the definitions below, which are the standard general data-types,
we omit the referent expressions, carriers, and formats except those that
are simple. (The fully general definition of radix-complement number
representation, for example, is too extensive to be worthwhile here.)

base-data-type / radix := data-type(referent: (binary / 2 | octal / 8 |
decimal / 10 | hexadecimal / 16); component: i-unit: (b | o | d | hex))

+integer-data-type / ui / unsigned-integer / magnitude := data-type
(referent: +integer; component: radix)

integer-data-type / i := sign-magnitude | radix-complement |
(radix − 1)-complement

number-data-type := data-type(referent: number; normalization:
(normalized / n | unnormalized / u); name: normalization . number-
data-type-name)


Table  3  Examples  of  commonly  used  data-types  (organized  by  basic 
i-units) 


bit    boolean / b
bit.array / ba
bit.vector / bv

byte / by
byte.string / by.st
10 byte.vector / 10 by.v

character / char / ch
char.string / char.st
10 char / 10 ch
4 char.vector / 4 ch.v

complex / cx

digit / d
10 digits / 10 d
digit vector / d.v
10 digit.array / 10 d.a

floating point / f
single floating point / sf
unnormalized floating point / uf
double floating point / df
double unnormalized floating point / duf
floating point vector / s.f.v / f.v

field

fraction / fr

integer / i
integer vector / iv
double integer / di

mixed / mx

word / w
half word / hw
double word / dw
triple word / tw
multiple word / mw
word vector / wv
word string / w.string
half word vector / hw.v
7 word / 7 w
8 word vector / 8 w.v


COMMENT  The  general  data-type  for  number  introduces  a  new  parameter 
(normalization)  to  prefix  the  name  of  all  numbers. 

mixed / mx / fixed-point := number-data-type(components: integer-part,
fractional-part)

floating-point / f := number-data-type(components: mantissa, exponent;
value-expression: mantissa × radix ↑ exponent)




complex := data-type(components: real, imaginary)    usually floating
complex

field := data-type(carrier: word; components: i-unit-list; format:
(element-range))

COMMENT A field is a subset of bits, or characters, or bytes in a word. It
is usually, though not always, an interval. See ISP 2 for element range.

EXAMPLES

12, 101, 5; +12, 5, −126        unsigned; and signed integers
+72, −999                       sign-magnitude
101₂, 77₈, A9₁₆                 binary, octal, and hexadecimal

+6.257; 6.257 × 10"             mixed, and floating point
(1, 2, 2.7)                     complex

1φ₂; 7φ₈                        digit set specification; stands for
                                10₂ | 11₂ and 70₈ | 71₈ | . . . | 77₈,
                                respectively

?                               questionable value

2.  Instruction 

2.1    instruction := data-type(referent: instruction-expression; operation-
code: field; operand-list: operand: data-type)

instruction-expression := condition → action-sequence

action-sequence := (step (next step)-list

step := action | condition → action-sequence

action := memory-expression ← data-expression

memory-expression := (
memory *[address-range]-list *<element-range> character-base |
memory-expression □ memory-expression | memory-expression-list)

address-range := address | address:address | address-expression |
address-range-list

address-expression := operation-expression(address-operations)

element-range := field | field-list

character-base := +integer        base i-unit

condition := boolean | memory-expression

data-expression := data-type | memory-expression |
operation-expression | data-expression(data-type)

operation-expression := (nonary-operation |
unary-operation data-expression |
data-expression binary-operation data-expression |
data-expression n-ary operation data-expression . . . |
function(data-expression-list) / f(data-expression-list) |
operation-expression *{operation-modifier})

operation-modifier := data-type | name        see GC 10

2.2  The  instruction  is  a  data-type  and  thus  has  both  a  representation  in 
memory  and  a  referent,  which  is  called  the  instruction-expression.  The 
only  fixed  part  of  the  instruction  format  is  the  operation-code.  All  the  rest 
are  operands  to  be  used  by  the  instruction-expression. 

2.3 The instruction-expression, when interpreted, takes the processor
through a sequence of steps which result (possibly) in some change of state
of the computer system that holds past the period of interpretation, thus
constituting a new initial condition for the next instruction. The action
sequence has two structural features. First, steps (and subsequences of steps)
may be conditional on a boolean value, developed according to a condition.
Second, steps may be accomplished in parallel or in series. Any set
of steps between two occurrences of the term "next" is to have all its
data expressions developed prior to any transmission of data. Thus, all
the data is a function of the existing state at the start of the sequence.
At the occurrence of the term "next," all pending transmissions are made,
so that the state for the following sequence of steps is now different (if
there were in fact transmissions to be made).
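The difference between steps separated by a semicolon and steps separated by "next" can be made concrete with a small simulation. The sketch below is an assumption-laden illustration, not part of ISP: the state is modeled as a Python dictionary, each step as a (destination, expression) pair, and the name run_action_sequence is hypothetical.

```python
def run_action_sequence(state, groups):
    """Interpret groups of steps; a "next" separates one group from another.

    Each step is (destination, expression), where expression is a function
    of the state. Within a group every expression is evaluated against the
    state as it stood at the start of the group; only then are the pending
    transmissions made.
    """
    for group in groups:
        pending = {dest: expr(state) for dest, expr in group}  # develop data
        state.update(pending)                                  # transmit
    return state


# (A <- B; B <- A): one group, so the two registers are exchanged.
parallel = run_action_sequence(
    {"A": 1, "B": 2},
    [[("A", lambda s: s["B"]), ("B", lambda s: s["A"])]],
)

# (A <- B; next B <- A): two groups, so B receives the new value of A.
serial = run_action_sequence(
    {"A": 1, "B": 2},
    [[("A", lambda s: s["B"])], [("B", lambda s: s["A"])]],
)

print(parallel)  # {'A': 2, 'B': 1}
print(serial)    # {'A': 2, 'B': 2}
```

Buffering the results in pending before updating the state is what guarantees that all data is a function of the state at the start of the group.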

2.4 All permanent changes in state are accomplished by means of actions,
which take data developed according to a data expression and transmit it
for storage in a memory, as designated by a memory expression.

EXAMPLES

A ← B;    B ← D − G;    B ← B + g

x1 ← 1 / x2;    x2 ← x1;    a ← abs(a);    a ← normalize(b)

AB ← a □ b

x1 (float) ← x2 (fixed)              fixed to floating data-type

x1 ← x1 + x2 (floating)              floating data

a ← a × 2° (logical)                 usually called logical shift, actually
                                     a boolean vector operation

AC, MQ ← AC □ MQ / M[z]

AC □ MQ ← AC × M[z]

A ← 6777                             nonary operation

G ← f(A, B, C)                       general function

A ← u B                              general unary operation

A ← B b C                            general binary operation

← max(a, B, XYZ, E, 4)               n-ary operation




2.5  The  memory  expression  specifies  the  contents  of  a  memory  (an  in- 
stance of  a  data-type)  by  giving  the  memory  switch  (possibly  compound), 
as  seen  from  PMS.  However,  all  that  is  represented  in  ISP  is  the  address 
that  is  used  to  control  the  switch.  The  address  is  a  data-type,  usually  rep- 
resented as  a  positive  integer.  The  element-range  is  a  field.  In  both  cases 
it  is  possible  to  specify  an  arbitrary  list  of  contents  (addresses  and  fields), 
although  in  most  processors  this  can  never  arise.  The  address-range  x:y 
means  from  address  x  to  address  y  inclusive. 
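As an illustration of how a memory expression such as M[x:y]<a:b> selects its contents, the following Python sketch reads a field from every word in an inclusive address range. Several things here are assumptions made for the example only: memory is a list of integers, bit 0 is taken to be the leftmost (most significant) bit of the word, and the function name read is hypothetical.

```python
def read(memory, word_bits, address_range, element_range):
    """Return field <first:last> of each word in M[lo:hi], both inclusive."""
    lo, hi = address_range
    first, last = element_range
    width = last - first + 1
    shift = word_bits - 1 - last          # bit 0 assumed leftmost
    mask = (1 << width) - 1
    return [(memory[a] >> shift) & mask for a in range(lo, hi + 1)]


# Four 12-bit words, written in octal.
M = [0o7777, 0o1234, 0o0017, 0o4000]

print(read(M, 12, (1, 2), (0, 2)))    # M[1:2]<0:2>   -> [1, 0]
print(read(M, 12, (0, 3), (9, 11)))   # M[0:3]<9:11>  -> [7, 4, 7, 0]
```

Note the use of range(lo, hi + 1): the address-range x:y includes address y, unlike a Python slice.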


EXAMPLES OF REGISTERS


A,  or  A; 


identical  names 
ternary  memory 
scalar  bits  of  an  array 
identical  registers 
38  bit  register 
identical  registers 
identical  vectors 
16x16  matrix 
3  dimensional  array 


boolean-memories; 
scalar  bits 

sign bit / sign_bit / sb

lb; b2; 2C1; 2C2'; C"; C "; "A"
end_around_shift or end around shift

i<2>; Z<a>

bc<12:8> or bc<12, 11, 10, 9, 8>
AC<P,Q,S,1:35>

X<0:7>₈ or X<0:23>₂ or X<0:23>
M[0:7777₈]<0:11> or M[0:4095]<0:3>₈
X[0:15][0:15]<31:0>
M[0:7][0:31][0:127]<0:n>

EXAMPLES OF RESTRUCTURING AND RENAMING

A<17> := B<4>;    A<0:1> := B<0, 4>

op<0:2> := i[1]<9:11>

A[0:3]<0:7> := A'<0:31>

indicator! UCXinOl,] := sense_switch<A>

XR[1:2][1:3]<B, A, 8, 4, 2, 1> := M[87:89,
92:94]<B, A, 8, 4, 2, 1>

EXAMPLES OF REGISTERS FORMED BY CONCATENATION

LAC<L, 0:11> := L □ AC<0:11>
AB<0:47> := A<0:23> □ B<0:23>

EXAMPLES OF REGISTERS FORMED BY A LIST OF REGISTERS

C, D<0:4> := B<7>, A<1:4> □ Z<8>        vectors formed from
                                         single bit vector

2.6 An address-expression is an operation-expression on addresses, i.e.,
using only the address-operations available in the processor. An
address-expression may imply the use of memory if it involves nested parentheses;
such memory is assumed to be temporary with no permanent effect on
the memory state.

2.7  A  condition  is  given  as  a  boolean,  that  is,  as  either  true  or  false 
(equivalently,  1  or  0),  or  the  result  of  a  boolean  expression  involving  the 
logical  connectives  or  relations  among  data-expressions  (see  Table  4,  ISP  3, 
and also GC 13). A condition can also be given as a memory-expression,
in which case the memory contents are normally evaluated as a boolean
vector, with all 0s being false and anything other than all 0s being true.

2.8  Data-expressions  are  either  instances  of  data-types;  the  contents  of  a 
memory,  as  given  by  a  memory-expression;  or  the  results  of  operation- 
expressions,  which  is  to  say,  the  results  of  operating  on  data-types  by  the 
data-operations  available  in  the  processor.  Data-expressions  may  imply  the 
use  of  memory  if  they  involve  nested  parentheses.  Such  memory  is  as- 
sumed to  be  temporary,  with  no  permanent  effects  on  the  memory  state 
of the processor or memory. The data-type name may sometimes follow
the data-expression, (data-type), in order to carry more information and
avoid more complex names for memory-expressions, etc. (see Chap. 2,
page 30, and ISP 3.1).

2.9  Operation-expressions  are  the  form  used  by  the  operations  (see  ISP  3). 
Note  that  the  operation-expression  as  a  whole  can  be  modified  by  an 
operation  modifier  enclosed  in  braces. 


EXAMPLES OF INSTRUCTIONS

add (:= op = 101) →                          integer add
    (L □ AC ← L □ AC + M[z])

jms (:= op = 100) → (M[z] ← PC; next         jump to subroutine
    PC ← z + 1)

FAD (:= op = +767) →                         single precision
    (FAC ← FAC + M[z] {s.f})                 floating point add

add → (A ← A + M[z] {two's complement})      the operation code
                                             need not be given

skip (:= op = 67) → ((A > 0) → P ← P + 2;
    (A = 0) → P ← P + 1)

add / "A" (:= op = 110001) →
    (Ov, M[B] ← M[B] + M[A] {string})
"B" (:= op = 1) → (A ← M[t][s])

((A ∧ B) ∨ (C > F)) → (G ← G + H)


3.  Operations 

3.1  Operations  are  defined  to  produce  results  of  specific  data-types  from 
operands  of  specific  data-types.  The  data-types  themselves  determine  by 
and  large  the  possible  operations  that  apply  to  them.  No  attempt  will  be 
made  to  define  the  various  operations  here,  as  they  are  all  familiar.  Table 
4  gives  the  notation  for  the  operation-types,  organized  by  data-types.  In 


Table  4  Data-operations 


operation-types := access-i-unit-operations | transmission-operation | control-operations | unary-arithmetic-operations | binary-arithmetic-operations |
n-ary-arithmetic-operations | conversion-arithmetic-operations | unary-vector-operations | relational-i-unit-operations |
relational-arithmetic-operations | boolean-operations

nonary-operation := memory-expression

unary-operation / u := unary-arithmetic-operations | unary-boolean-operations        see GC 13
binary-operation / b := binary-arithmetic-operations | binary-boolean-operations    see GC 13
n-ary-operation := n-ary-arithmetic-operations | n-ary-boolean-operations           see GC 13


Operation                           Abbreviation  Result¹  Operation Form¹         Comments

access-i-unit-operations                                                            basic operation is to access an i-unit
                                                                                    in a memory (e.g., word vector)
  read                                                                              access t₁ for reading
  write                                                                             access t₁ for writing
  vector element read                                      v₁[i₂]                   the i₂th element of vector₁ is read
  vector element write                                                              the i₂th element of vector₁ is written
  concatenation                                            t₁ □ t₂                  t₁ and t₂ are combined to form t₃
  extraction                                               t₁<element-range>        some part of t₁ forms t₂

transmission-operation
  transmit                                                                          t₂ receives i-unit of t₁; involves read,
                                                                                    transmit, and write

control-operations
  evoke                                                    b₁ → action-sequence     if b₁ is true then action-sequence is
                                                                                    applied; else the action-sequence is
                                                                                    ignored
  next                                                                              the occurrence of "next" implies
                                                                                    operations following occur later

unary-arithmetic-operations
  absolute value or magnitude       abs                    abs(n₁)                  n₁ may be unsigned data-type
  negate
  reciprocal                        1 /
  integer part                                             integer_part(n₁)         n₂ is an integer data-type
  fraction part                                            frp(n₁)                  n₁ may be mixed | f | unf
  sign                                                     sgn(n₁)                  n₁ may not be ui | ufr
  round                                                    round(n₁, n₂)            used with multiply, divide
  normalize, mantissa part                                 normalize(n₁)            used with f arithmetic
  normalize exponent,                                      normalize_exponent(n₁)   to fix numbers into a standard form
    exponent part
  square root                       sqrt                   sqrt(n₁)                 (n₁ > 0)
  square                            ( )²
  logarithms                        log, ln                log₁₀(n₁)                log₁₀(n₁)
                                                                                    logₑ(n₁)
  exponential                       e                      e ↑ n₁
  trigonometric                     trigfcn                trigfcn(n₁)              also sin, sin sinh, etc., for the
                                                                                    separate trigonometric function
                                                                                    (both radians and degrees)
  random (parameter for                                    random(n₁)               n₁ may be previous pseudo-random
    particular distributions)                                                       number (seed)
  arithmetic shift of radix, r                             n₁ × r ↑ i₁              if i₁ is signed, then either form can
                                                           n₁ / r ↑ i₁              be used for both × and /

¹ Results and operation forms are given in terms of the data-types to which they apply: b, booleans; i, integers; f, floating; n, any numeric data-type (e.g., floating,
integer, mixed); t, all data-types; v, vectors.


Table  4    Data-operations  (Continued) 


Operation                           Abbreviation  Result    Operation Form                     Comments

binary-arithmetic-operations
  add                                                       n₁ + n₂
  subtract                                                  n₁ − n₂
  inverse subtract
  multiply                                                  n₁ × n₂
  divide                                          n₃ or n₄  n₁ / n₂                            where only n₃ or n₄ may be used to
                                                                                               give ... or ...
  inverse divide                                                                               see divide; similar to inverse subtract
  modulo                            mod                     i₁ mod i₂                          i₁ − (i₁ / i₂) × i₂; remainder

conversion-arithmetic-operations
  fix-to-float                                              float(i₁)                          integer or fixed to floating
  float-to-fix                                              fix(f₁)                            floating number to integer

unary-vector-operations                                                                        radix r; note if r = 2, the character
                                                                                               is a bit
  end-around-shift (rotate)         × r {rotate}  v₃        v₁ × r ↑ i {rotate}
                                    / r {rotate}  v₃        v₁ / r ↑ i {rotate}
  logical-shift                     × r {logical} v₃        v₁ × r ↑ i {logical}               the most or least significant digits
                                    / r {logical} v₃        v₁ / r ↑ i {logical}               receive 0's in the shift
  tally/count                                               tally(b.v)                         count 1's in a vector
  sign extend                                               sign_extend(b.v₁)                  copy sign of b.v to fill vector in n₂

n-ary-arithmetic-operations
  minimum                           min                     min(n₁, n₂, . . . , nₘ)            smallest of n₁ . . . nₘ
  maximum                           max                     max(n₁, n₂, . . . , nₘ)            largest of n₁ . . . nₘ
  summation                         sum                     sum(n₁, n₂, . . . , nₘ)            n₁ + n₂ . . . + nₘ
  average                           avg                     avg(n₁, n₂, . . . , nₘ)            (n₁ + n₂ . . . + nₘ) / m
  product                           prod                    prod(n₁, n₂, . . . , nₘ)           n₁ × n₂ . . . × nₘ

relational-i-unit-operations                                                                   comparison of two i-units
  identical                                                 d₁ = d₂
  not identical

relational-arithmetic-operations                                                               comparison of two numbers
  equality
  inequality
  less than                         <                       n₁ < n₂
  greater than                      >                       n₁ > n₂
  less than or equal to             ≤                       n₁ ≤ n₂
  greater than or equal to          ≥                       n₁ ≥ n₂

boolean-operations                                                                             all 16 possibilities are listed
  false (0)                         0                       0
  and                               ∧                       b₁ ∧ b₂
                                                            b₁ ∧ ¬b₂
  null                                                      b₁
                                                            ¬b₁ ∧ b₂
  null                                                      b₂
  exclusive or                      ⊕                       ¬(b₁ = b₂) = ((b₁ ∧ ¬b₂) ∨
                                                            (¬b₁ ∧ b₂)) = b₁ ⊕ b₂
  inclusive or                      ∨                       b₁ ∨ b₂
  nor/Peirce stroke                                         ¬b₁ ∧ ¬b₂ or ¬(b₁ ∨ b₂)
  coincidence                                               b₁ = b₂ or ¬(b₁ ⊕ b₂) or
                                                            ((b₁ ∧ b₂) ∨ (¬b₁ ∧ ¬b₂))



Table  4    Data-operations  (Continued) 


Operation                           Abbreviation  Result  Operation Form               Comments

  not                                             b₃      ¬b₂
  implication-inverse                                     b₁ ∨ ¬b₂
  not                               ¬                     ¬b₁
  implication                       ⊃                     ¬b₁ ∨ b₂ or b₁ ⊃ b₂
  nand/Sheffer stroke               ↑             b₃      ¬b₁ ∨ ¬b₂ or ¬(b₁ ∧ b₂)
  true (1)                          1             b₃      1

boolean-operations (common set)
  not                               ¬             b₃      ¬b₁
  and                               ∧             b₃      b₁ ∧ b₂
  or                                ∨             b₃      b₁ ∨ b₂
  exclusive or                      ⊕             b₃      b₁ ⊕ b₂

boolean-operations (sufficient sets)
  nand                              ↑             b₃      ¬(b₁ ∧ b₂)
  nor                               ↓             b₃      ¬(b₁ ∨ b₂)
  not                               ¬             b₃      ¬b₁                          this pair of operations is required
  and                               ∧             b₃      b₁ ∧ b₂                      for a sufficient set

order to have an open-ended scheme for operating on many data-types
and defining new operators, the operation modifier is used. The operation
modifier, enclosed in braces, is used to distinguish operations from one
another. The operation modifier is usually the name of a data-type, but it
can also be a descriptive name applying to the operation (e.g., rotate).
For example, the various add operations on differing data-types are specified
by writing {data-type} after the operation (see Chap. 2, page 30).
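One way to picture the role of the operation modifier is as a selector that tells a single operation symbol how to interpret its operands. The Python sketch below is only an illustration under stated assumptions: a 12-bit word is assumed, the two modifiers are the integer data-types sign-magnitude and two's complement (radix-complement with radix 2) named in ISP 1.5, and the function names decode, encode, and add are hypothetical.

```python
WORD_BITS = 12                       # assumed word length for the example
SIGN = 1 << (WORD_BITS - 1)          # leftmost bit
MASK = (1 << WORD_BITS) - 1


def decode(word, data_type):
    """Map a bit pattern to the integer it refers to under data_type."""
    if data_type == "two's complement":
        return word - (1 << WORD_BITS) if word & SIGN else word
    if data_type == "sign-magnitude":
        magnitude = word & (SIGN - 1)
        return -magnitude if word & SIGN else magnitude
    raise ValueError(f"unknown modifier: {data_type}")


def encode(value, data_type):
    """Map an integer back to a bit pattern (no overflow detection)."""
    if data_type == "two's complement":
        return value & MASK
    if data_type == "sign-magnitude":
        return (SIGN if value < 0 else 0) | (abs(value) & (SIGN - 1))
    raise ValueError(f"unknown modifier: {data_type}")


def add(a, b, modifier):
    """a + b {modifier}: the modifier selects which add is meant."""
    return encode(decode(a, modifier) + decode(b, modifier), modifier)


# The same two bit patterns give different results under different modifiers.
print(oct(add(0o4001, 0o0003, "sign-magnitude")))    # 0o2
print(oct(add(0o4001, 0o0003, "two's complement")))  # 0o4004
```

The scheme is open-ended in the sense of the text: supporting another data-type means adding another branch (or table entry), not inventing a new operator symbol.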

3.2 Operations can be defined for the most inclusive data-types for which
they will work and can then be applied to more specific data-types. The
most general instance of this is the transmit operation, which works on
i-units and is therefore used for all specific data-types, such as numbers
(because it works on their carriers). Another example is the relational
operations of equality and inequality.

3.3 New operations can be defined by means of forms (see GC 4.5). We
simply give some examples.


EXAMPLES

X1 + X2 := (X1 + X2;                      two's complement add
    (X1 + X2 > 2¹¹) → (Ov ← 1))           side effect, set Ov

X1<11:0> := X2 × 2 {rotate} := (          rotate operation; end
    X1<11:1> := X2<10:0>;                 bits, X<11> and X<0>,
    X1<0> := X2<11>)                      are connected
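The two definitions in these examples can be mirrored in Python for a 12-bit register X<11:0>. This is a sketch under assumptions: bit 11 is taken as the sign (leftmost) bit, overflow is detected with the usual two's complement rule (the true sum falls outside the representable range), which may differ in detail from the condition printed in the example, and the function names are hypothetical.

```python
WORD_BITS = 12
MASK = (1 << WORD_BITS) - 1
LIMIT = 1 << (WORD_BITS - 1)          # 2**11


def to_signed(word):
    """Interpret a 12-bit pattern as a two's complement integer."""
    return word - (1 << WORD_BITS) if word & LIMIT else word


def add_with_overflow(x1, x2):
    """Return (sum, Ov); Ov is the side effect of the add."""
    total = to_signed(x1) + to_signed(x2)
    in_range = -LIMIT <= total < LIMIT
    return total & MASK, 0 if in_range else 1


def rotate_left_one(x2):
    """X1<11:1> := X2<10:0>; X1<0> := X2<11>."""
    high_bit = (x2 >> (WORD_BITS - 1)) & 1    # X2<11>
    low_bits = x2 & (MASK >> 1)               # X2<10:0>
    return (low_bits << 1) | high_bit


print(add_with_overflow(0o3777, 0o0001))   # (2048, 1): 2047 + 1 overflows
print(add_with_overflow(0o7777, 0o0001))   # (0, 0):    -1 + 1
print(oct(rotate_left_one(0o4001)))        # 0o3
```

The rotate is written as two field assignments on purpose, so that it reads the same way as the ISP form: the end bits are connected by moving X2<11> into X1<0>.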


4.  Processors 

4.1 The ISP definition of a processor consists of a set of instructions, which
involve a set of operations, data-types, memories, and other PMS components,
plus an interpreter that finds the next instruction and executes it.
These sets are all values of corresponding attributes of the PMS description
of a processor. All these aspects of an ISP processor have to be declared
in giving the description. In practice, some of them are given by
having the PMS description available (e.g., word size, T's, Ms's, etc.);
others declare themselves simply by occurring in the ISP expressions (e.g.,
most of the operations and data-types). We list below the common form
of the machine ISP descriptions as a reader will find them in the chapter
appendices of this book.

4.2 Memory (Mps, Mp and M(T.console)). The processor state memory
is declared first. It holds the information necessary to restart the processor
if it is stopped between instructions. Table 1 (page 621) names the
functions of the memory (e.g., program counter, accumulators, etc.). The
state also includes the interrupt status, machine fault bits, etc. Any
memory-mapping hardware registers are considered part of this state.

The  primary  memory,  the  largest  state,  is  used  to  hold  the  program 
that  the  processor  interprets.  It  also  holds  data. 

The  console  state  is  accessible  from  the  operator's  console.  Only  the 
bits  that  are  part  of  the  ISP  are  relevant,  i.e.,  bits  that  can  be  used  to 
change  the  state  of  the  primary  memory  or  processor  state.  The  switches 
that  are  used  to  start  and  stop  the  machine  should  also  be  given  in  a 
complete  definition. 




4.3 Instruction Format. The instruction formats are usually declared in
the same fashion as memory and are not distinguishable as special
non-memory entities. Normally, the instructions are carried in registers; it is
thus natural to give declarations in this fashion. Usually only a single
declaration is made, the instruction/i, followed by the declarations of the
parts of the instruction: the operation code, the address fields, indirect
bit, etc.


i/instruction[0:4]<0:7>        five 8 bit byte instruction

op<0:4> := i[0]<0:4>           opcode

r<0:2> := i[0]<5:7>            register address

d<0:15> := i[1:2]<0:7>         16 bit address
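The field declarations above translate directly into shift-and-mask operations. The sketch below assumes that bit 0 is the leftmost (most significant) bit of each byte and that i[1] supplies the high-order half of d; neither is stated in the declaration itself, and the function names field and decode are hypothetical.

```python
def field(byte, first, last, width=8):
    """Extract bits <first:last> of a byte, bit 0 assumed leftmost."""
    size = last - first + 1
    return (byte >> (width - 1 - last)) & ((1 << size) - 1)


def decode(i):
    """Split a five-byte instruction i[0:4] into (op, r, d)."""
    op = field(i[0], 0, 4)          # op<0:4> := i[0]<0:4>
    r = field(i[0], 5, 7)           # r<0:2>  := i[0]<5:7>
    d = (i[1] << 8) | i[2]          # d<0:15> := i[1:2]<0:7>
    return op, r, d


instruction = bytes([0b10110_011, 0x12, 0x34, 0x00, 0x00])
print(decode(instruction))          # (22, 3, 4660)
```

Bytes i[3] and i[4] are not named by the declarations shown, so the sketch leaves them uninterpreted.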

4.4 Effective Address Calculation Process. This process is declared using
the assignment command (:=) and is evoked each time an instruction
makes reference to a variable that is taken to be an effective address or an
operand. In the book operands have two forms. Most of the time they are
expressed as memories and address expressions using the effective address
calculation process; otherwise the operands are defined by a process.

EXAMPLES

Conditional register definition

z<0:11> := (¬i → z';                          effective address
    i → (M[z'] + 1;                           with side effects
        M[z'] ← M[z'] + 1))

G := M[g]                                     operand definition
                                              process

shift_count / SC<0:25> :=
    (¬F → e'; F → z)

E'<21:35> :=
    ((T = 0) → y; (T ≠ 0) → XR[T] + y)        index convention

Declarations in terms of a variable parameter

Mp[z] := ((z ≤ FL) → Mp[z + RA];
    (z > FL) → (Run ← 0;                      only side effects,
        violation ← 1))                       no value

Evaluated expressions

add_instruction := (op = 5)                   boolean

z<0:6> := (a<0:5,7> + b<1:7>)                 7 bit value

skip_condition := (¬Q ∧ d<15> ∨ z<6>)
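The first example, a conditional definition of the effective address z with a side effect on memory, can be sketched in Python as follows. The sketch assumes a 12-bit address (matching z<0:11>), models memory as a list, and uses the hypothetical function name effective_address; wrapping the incremented value to 12 bits is likewise an assumption.

```python
ADDRESS_MASK = (1 << 12) - 1        # z<0:11> is assumed to hold 12 bits


def effective_address(M, z_prime, i):
    """Evaluate z each time it is referenced.

    not i: z is simply z'.
    i:     z is M[z'] + 1, and M[z'] is incremented as a side effect.
           Both use the value of M[z'] from before the step.
    """
    if not i:
        return z_prime
    value = (M[z_prime] + 1) & ADDRESS_MASK
    M[z_prime] = value              # side effect: M[z'] <- M[z'] + 1
    return value


M = [0] * 8
M[3] = 5
print(effective_address(M, 3, i=False), M[3])   # 3 5
print(effective_address(M, 3, i=True), M[3])    # 6 6
print(effective_address(M, 3, i=True), M[3])    # 7 7
```

Because the process is evoked on every reference, two references to z in the indirect case advance M[z'] twice, which is why such definitions are labeled as having side effects.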

4.5 Data-type Format and Special Data-Operation Definitions. The component
parts of the data-types are named, and their element ranges are
first defined, so that the data-operation definitions can use them. For example,
a precise definition of an ISP would include the data-type formats
(for example, floating-point), followed by a definition of each data operation
(for example, +, −, ×, /). Normally, we do not give enough information
about the data-type and its appropriate operation implementation
in our description of machines, since the information for these descriptions
is obtained from the programming manuals. If we were actually to use the
ISP descriptions as an interpreter using a compiled or interpreted language,
then only a few well-defined primitives would exist in the language
and all other operations would have to be defined in terms of these primitives
for each ISP. ISP 2 and ISP 3 describe how the various data-types
and operations are declared.

4.6  Instruction  Interpretation  Process.  In  the  definition  of  processors,  the 
only  part  that  is  executed  is  the  instruction  interpreter.  All  the  other  parts 
are  memory  data  declarations  and  processes  to  be  carried  out  as  an  indirect 
consequence  of  the  interpretation  process.  The  format  for  most  interpreters 
is  the  familiar  fetch-the-instruction  then  execute-the-instruction  pair  of 
states,  and  consists  of  only  one  ISP  statement. 


Run → (instruction ← M[PC];        fetch {PC/program counter}
    PC ← PC + 1; next
    Instruction_execution)         execute
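The single-statement interpreter above corresponds to a very short loop. The Python sketch below is an illustration only: the processor state is a dictionary, and the three-instruction machine used to exercise it (halt, add, store, encoded as op × 100 + address) is entirely hypothetical and chosen just to make the example runnable.

```python
def interpret(M, state, instruction_execution):
    """Run -> (instruction <- M[PC]; PC <- PC + 1; next Instruction_execution)."""
    while state["Run"]:
        # Fetch. The two actions are parallel in ISP; doing the fetch first
        # gives the same result because both use the old value of PC.
        state["instruction"] = M[state["PC"]]
        state["PC"] += 1
        # "next": the fetch is complete before execution begins.
        instruction_execution(M, state)


def instruction_execution(M, state):
    """Hypothetical instruction set: op = instruction // 100, z = remainder."""
    op, z = divmod(state["instruction"], 100)
    if op == 0:
        state["Run"] = 0              # halt
    elif op == 1:
        state["A"] += M[z]            # add:   A <- A + M[z]
    elif op == 2:
        M[z] = state["A"]             # store: M[z] <- A


M = [105, 105, 206, 0, 0, 7, 0]       # add 5; add 5; store 6; halt; data
state = {"Run": 1, "PC": 0, "A": 0, "instruction": 0}
interpret(M, state, instruction_execution)
print(state["A"], M[6], state["PC"])  # 14 14 4
```

Only interpret is "executed" in the sense of 4.6; the instruction set is passed in as a definition that the interpreter evokes.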

In more complex processors the conditions for trapping and interrupting
must be described. Also, in the interpretation process it is often more
descriptive to carry out part of effective address calculation prior to
Instruction_execution. See below.


¬interrupt ∧ Run →
    (op[0] ← M[PC]; PC ← PC + 1; next        fetch
    long_instruction →
        (op[1] ← M[PC];                      fetch more instruction
        PC ← PC + 1); next                   if a long instruction
    Instruction_execution)                   execute

interrupt ∧ Run → (M[0] ← PC; PC ← 1;        interrupt, save
    interrupt ← 0)                           PC and go to M[1]

The  IBM  1401  interpreter  (Chap.  18)  requires  a  separate  process  to  fetch 
the  operands  addresses  prior  to  execution  in  a  variable-length  instruction. 
The  fetch  is  based  on  the  specific  instruction  to  be  executed  next. 




EXAMPLE

Run → (op ← M[PC]; PC ← PC + 1; next        fetch
    Fetch_operands_addresses; next          fetch operands
    Instruction_execution)                  execute

4.7 Instruction-Set and Instruction Execution Process. The instruction-set
and the process by which each instruction is executed are usually given
together in a single definition. This process is called Instruction_execution
in all the ISP descriptions in this book. It usually includes the definition of
the conditions for execution, the instruction (i.e., its operation code), the
name of the instruction, its mnemonic name, and the process for execution.


Instruction_execution := (
    add → (A ← A + M[z]);
    opr → (qqq);
    and → (A ← A ∧ M[q]))            end Instruction_execution

where

qqq := (cb → (A ← 0); next           secondary definition
    cnib → (A ← ¬A);
    pi → (A ← A + 1))                end qqq definition


Bibliography 


Abbreviations 


Journals

ACM                         Association for Computing Machinery
ADC                         Automatic Digital Computation
AFIPS                       American Federation of Information Processing Societies
AIEE-IRE Conf.              American Institute of Electrical Engineers-Institute of Radio Engineers Conference
Appl. Sci. Res.             Applied Scientific Research
EJCC                        Eastern Joint Computer Conference
FJCC                        Fall Joint Computer Conference
SJCC                        Spring Joint Computer Conference
WJCC                        Western Joint Computer Conference
IBM J. of Res. and Dev.     IBM Journal of Research and Development
IBM Sys. J.                 IBM Systems Journal
ICIP                        International Conference on Information Processing
IEE                         Institution of Electrical Engineers, London
IEEE                        Institute of Electrical and Electronics Engineers
IFIP                        International Federation for Information Processing
IRE                         Institute of Radio Engineers
Psychology Rev.             Psychology Review

General

Bull.                       Bulletin
Comm.                       Communications
Conf.                       Conference
Cong.                       Congress
J.                          Journal
Proc.                       Proceedings
Pt.                         Part
Res. Rept.                  Research Report
Supp.                       Supplement
Symp.                       Symposium
Trans.                      Transactions


Reports,  manuals,  and  miscellaneous 

"Study of a Computer Directly Implementing an Algebraic Language,"
AD633-727, Air Force Office of Scientific Research
Contract AF19(628)-2798.

Control Data 6600 Computer System Reference Manual, 1st ed.
Publ. 450, Copyright © 1963, Control Data Corporation,
Minneapolis 20, Minn.

"Digital Small Computer Handbook," 1967 Edition, Copyright
© 1967, all rights reserved, Digital Equipment Corporation,
Maynard, Mass.

Programmed Buffered Display 338 Programming Manual,
PDP-8, DEC08-G61C-D, Copyright © 1967, all rights reserved,
Digital Equipment Corporation, Maynard, Mass.

A22-6703, IBM 7094 Principles of Operation, Data Processing
System, Copyright © 1959, 1960, 1961, 1962, International
Business Machines Corporation.

A22-6821-4 IBM System/360 Principles of Operation.

A22-6810-8 IBM System/360 System Summary.

IBM System/360 Functional Characteristics Manuals for each
Model.

IBM System/360 Configurator (diagram) for each Model.

IBM OS/360: PL/I Language Specification, Form C28-6571,
p. 74.

H20-0223-0, IBM System/360 Attached Support Processor
System (ASP) System Description, Copyright © 1966,
International Business Machines Corporation.

A24-1403-5, IBM 1401 Reference Manual, Data Processing
System, Copyright © 1960, 1961, 1962, International Business
Machines Corporation.

225-6487-3, IBM 1401 Customer Engineering Reference Manual,
Copyright © 1960, 1961, 1962, 1963, International
Business Machines Corporation.




A26-5919-4,  IBM  1800  Data  Acquisition  and  Control  System 
Configurator. 

A26-5918-5,  IBM  1800  Functional  Characteristics,  Copyright© 
1966,  International  Business  Machines  Corporation. 

IBM  1620  FORTRAN:  Preliminary  Specifications,  Form  J29- 
4200-2,  April,  1960. 

FORTRAN  Specifications  and  Operating  Procedures,  IBM  1401, 
IBM  Systems  Ref.  Lib.  C24-1455-2. 

International  Business  Machines  Corporation,  General  Infor- 
mation Manual  FORTRAN,  Form  F28-807401,  December,  1961. 

Type  650  Magnetic  Drum  Data-processing  Machine  (Manual  of 
Operations),  Form  22-60  60-1,  International  Business  Machines 
Corporation,  New  York,  1955. 

Librascope  LGP-30,  Manual,  Librascope,  Inc.,  80  Western  Ave., 
Giendale,  Calif. 

Olivetti  Underw/ood  Programma  101  General  Reference  Manual, 
Olivetti  Underwood  Corporation,  One  Park  Avenue,  New  York, 
10016. 

Pegasus  Maintenance  Manuals,  Ferranti  Ltd.,  London. 
Pegasus  Programming  Manual,  Ferranti  Ltd.,  London. 
Proceedings  Conference  on  Spaceborne  Computer  Engineering, 
Anaheim,  Calif.,  Oct.  30-31,  1962. 

Scientific Data Systems Reference Manual, SDS 930 Computer,
Copyright © 1965, 1966, 1967, Scientific Data Systems, Inc.,
1649 Seventeenth Street, Santa Monica, Calif.
Scientific Data Systems Reference Manual, SDS 9300 Computer,
Copyright © 1963, 1964, 1965, 1966, 1967, Scientific Data
Systems, Inc., 1649 Seventeenth Street, Santa Monica, Calif.
Symposium  on  Multi-programming  (Concurrent  Programs), 
Information  Processing,  1962  Proc.  IFIP  Congress,  pp.  570- 
575,  North-Holland  Publishing  Company,  Amsterdam,  1963. 
Univac Scientific Electronic Computing System Model 1103A,
Form EL338, Remington-Rand Corporation, 1902 West
Minnehaha Ave., St. Paul W4, Minn.

"Comprehensive  System  Manual,  A  System  of  Automatic  Cod- 
ing for  the  Whirlwind  Computer,"  Digital  Computer  Laboratory, 
Massachusetts  Institute  of  Technology,  Cambridge  39,  Mass., 
August,  1955;  revised,  December,  1955. 


Books  and  periodicals 

AdamA60  Adams Associates: Computer Characteristics
Quarterly, summary of the characteristics of
computers being currently manufactured,
Cambridge, Mass. Specific quarterlies used: January,
1966, vol. 6, no. 1; 1st and 2nd quarters, 1967,
vol. 7, nos. 1, 2; 4th quarter, 1967, and 1st
quarter, 1968, vol. 7, no. 4, vol. 8, no. 1 (first
published in 1950).

AdamC60  Adams, C. W.: A Chart for EDP Experts,
Datamation, vol. 6, pp. 13-17, November-December,
1960. See AdamA60.

AdamC62  Adams,  Charles  W.:  Grosch's  Law  Repealed, 
Datamation,  vol.  8,  no.  7,  pp.  38-39,  July,  1962. 

AinsE52  Ainsworth, Ernest: SEAC Input-Output
Operating Experience, AIEE-IRE-ACM Conf., pp. 44-47,
December, 1952.

AlexS51  Alexander, S. N.: The National Bureau of
Standards Eastern Automatic Computer (SEAC),
AIEE-IRE Conf., pp. 84-89, December, 1951.

AllaR64  Allard,  R.  W.,  K.  A.  Wolf,  and  R.  A.  Zemlin:  Some 

Effects  of  the  6600  Computer  on  Language 
Structures,  Comm.  ACM,  vol.  7,  no.  2,  pp.  112- 
119,  February,  1964. 

AlleM63  Allen,  M.  W.,  T.  Pearcey,  J.  P.  Penny,  G.  A.  Rose, 

and  J.  G.  Sanderson:  CIRRUS,  An  Economical 
Multiprogram  Computer  with  Microprogram 
Control,  IEEE  Trans.,  vol.  EC-12,  no.  6,  pp. 
663-671,  December,  1963. 

AllmR62  Allmark,  R.  H.,  and  J.  R.  Lucking:  Design  of  an 

Arithmetic  Unit  Incorporating  a  Nesting  Store, 
Proc.  IFIP  Cong.  1962,  pp.  694-698,  1962. 

AlonR60  Alonso, R. L., and J. H. Laning, Jr.: Design
Principles for a General Control Computer,
Institute of Aeronautical Sciences, New York,
S. M. Fairchild Publ. Fund Paper FF-29, April,
1960.

AlonR61  Alonso, R. L., J. H. Laning, Jr., and H.
Blair-Smith: Preliminary MOD 3C Programmers
Manual, M.I.T. Instrumentation Lab., Rept. E-1077,
1961.

AlonR62  Alonso, R. L., A. Green, H. Maurer, and R.
Oleksiak: A Digital Control Computer; Development
Model 1B, M.I.T. Instrumentation Lab.,
Rept. R-358 (confidential), April, 1962.

AlonR63  Alonso, R. L., H. Blair-Smith, and A. L. Hopkins:
Some Aspects of the Logical Design of a Control
Computer, A Case Study, IEEE Trans., vol.
EC-12, no. 6, pp. 687-697, December, 1963.

AmdaG62  Amdahl, Gene M.: New Concepts in Computing
System Design, Proc. IRE, vol. 50, no. 5, pp.
1073-1077, May, 1962.

AmdaG64a  Amdahl, G. M., G. A. Blaauw, and F. P. Brooks,
Jr.: Architecture of the IBM System/360, IBM
J. Res. and Dev., vol. 8, no. 2, pp. 87-101, April,
1964. Review TeagH65.

AmdaG64b  Amdahl, G. M.: Processing Unit Design
Considerations, IBM Sys. J., vol. 3, no. 2, pp. 144-164,
1964.

AmdaG64c  Amdahl, G. M.: The Model 92 as a Member of
the System 360 Family, AFIPS Proc. FJCC, Pt. II,
vol. 26, pp. 69-72, 1964. Review GrimR65b.

AndeD67  Anderson, D. W., F. J. Sparacio, and R. M.
Tomasulo: The IBM System/360 Model 91: Machine
Philosophy and Instruction Handling, IBM
J. of Res. and Dev., vol. 11, no. 1, pp. 8-24,
January, 1967.

AndeJ61  Anderson, James P.: A Computer for Direct
Execution of Algorithmic Languages, AFIPS Proc.
EJCC, vol. 20, pp. 184-193, 1961.

AndeJ62  Anderson, James P., Samuel A. Hoffman,
Joseph Shifman, and Robert J. Williams:
D825-A Multiple Computer System for Command
and Control, AFIPS Proc. FJCC, vol. 22,
pp. 86-96, 1962.

AndeJ65  Anderson, James P.: Program Structures for
Parallel Processing, Comm. ACM, vol. 8, no. 12,
pp. 786-788, December, 1965.

AndeS67  Anderson, S. F., J. G. Earle, R. E. Goldschmidt,
and D. M. Powers: The IBM System/360 Model
91: Floating-point Execution Unit, IBM J. of Res.
and Dev., vol. 11, no. 1, pp. 34-53, January,
1967.

ArbuR66  Arbuckle, R. A.: Computer Analysis and Thruput
Evaluation, Computers and Automation, p. 13,
January, 1966.

ArdeB66  Arden, B. W., B. A. Galler, T. C. O'Brien, and
F. H. Westervelt: Program and Addressing
Structure in a Time-sharing Environment, J. ACM, vol.
13, no. 1, pp. 1-16, January, 1966.

AstrM52  Astrahan, M. M., and N. Rochester: The Logical
Organization of the New IBM Scientific Calculator,
Proc. ACM, Pittsburgh Conf., pp. 79-83, May,
1952.

BaldF62  Baldwin, F. R., W. B. Gibson, and C. B. Poland:
A Multiprocessing Approach to a Large Computer
System, IBM Sys. J., vol. 1, pp. 64-76,
September, 1962.

BarnG68  Barnes, George H., Richard M. Brown, Maso
Kato, David J. Kuck, Daniel L. Slotnick, and
Richard A. Stokes: The ILLIAC IV Computer,
IEEE Trans., vol. C-17, no. 8, pp. 746-757,
August, 1968.

BartR61  Barton, R. S.: A New Approach to the Functional
Design of a Digital Computer, Proc. WJCC, pp.
393-396, 1961.

BashT64  Bashkow, T. R.: A Sequential Circuit for
Algebraic Statement Translation, IEEE Trans., vol.
EC-13, no. 2, pp. 102-105, April, 1964.

BashT67  Bashkow, Theodore, Azra Sasson, and Arnold
Kronfeld: System Design of a FORTRAN Machine,
IEEE Trans., vol. EC-16, no. 4, pp. 485-499,
August, 1967.

BasiI57  Basilewskii, Iu. Ia.: The Universal Electronic
Digital Machine (URAL) for Engineering Research,
J. ACM, vol. 4, no. 2, pp. 511-519, 1957.

BeckF61  Beckman, F. S., F. P. Brooks, Jr., and W. J.
Lawless, Jr.: Developments in the Logical
Organization of Computer Arithmetic and Control
Units, Proc. IRE, vol. 49, no. 1, pp. 53-66,
January, 1961.

BernA58  Bernstein, A., M. De V. Roberts, T. Arbuckle, and
M. A. Belsky: A Chess Playing Program for the
IBM 704, Proc. WJCC, pp. 157-159, 1958.

BhusA67  Bhushan, A., R. H. Stotz, and J. E. Ward:
Recommendations for an Intercomputer
Communications Network for M.I.T. Memorandum
MAC-M-35.5, July, 1967.

BlaaG59  Blaauw, G. A.: Indexing and Control-word
Techniques, IBM J. of Res. and Dev., vol. 3, no. 2,
pp. 288-301, July, 1959.

BlaaG64a  Blaauw, G. A., and F. P. Brooks, Jr.: The
Structure of System/360, Part I—Outline of the
Logical Structure, IBM Sys. J., vol. 3, no. 2, pp.
119-135, 1964.

BlaaG64b  Blaauw, G. A.: Multisystem Organization, IBM
Sys. J., vol. 3, no. 2, pp. 181-195, 1964.

BlocE59  Bloch, Erich: The Engineering Design of the
Stretch Computer, Proc. EJCC, pp. 48-58, 1959.

BlosR60  Blosk, R. T.: The Instruction Unit of the
STRETCH Computer, Proc. EJCC, pp. 299-324,
1960.




BockR63  Bock,  R.  V.:  An  Interrupt  Control  for  the  B  5000 
Data  Processor  System,  AFIPS  Proc.  FJCC,  vol. 
24,  pp.  229-241,  1963. 

BolaL67  Boland, L. J., G. D. Granito, A. U. Marcotte,
B. U. Messina, and J. W. Smith: The IBM
System/360 Model 91: Storage System, IBM J. of
Res. and Dev., vol. 11, no. 1, pp. 54-68, January,
1967.

BoutE63  Boutwell, E., Jr., and E. A. Hoskinson: The
Logical Organization of the PB 440
Microprogrammable Computer, AFIPS Proc. FJCC, vol. 24,
pp. 201-213, 1963.

BowdB53  Bowden,  B.  V.,  editor:  "Faster  than  Thought," 
Sir  Isaac  Pitman  and  Sons,  Ltd.,  London,  1953. 

BrigH64  Bright, H. S.: A Philco Multiprocessing System,
AFIPS Proc. FJCC, Pt. II, vol. 26, pp. 97-141,
1964.

BrooF57a  Brooks, F. P., Jr.: A Program-controlled
Program Interruption System, Proc. EJCC, pp. 128-132,
1957.

BrooF57b  Brooks, F. P., Jr., A. L. Hopkins, Jr., P. G.
Neumann, and M. V. Wright: An Experiment in
Musical Composition, IRE Trans., vol. EC-6, no.
3, pp. 175-182, September, 1957.

BrooF59  Brooks, F. P., Jr., G. A. Blaauw, and W.
Buchholz: Processing Data in Bits and Pieces, IRE
Trans., vol. EC-8, no. 2, pp. 118-124, June,
1959.

BrooF60  Brooks,  F.  P.:  The  Execute  Operations,  A  Fourth 

Mode  of  Instruction  Sequencing,  Comm.  ACM, 
vol.  3,  no.  3,  pp.  168-170,  March,  1960. 

BrooR60  Brooker,  R.  A.:  Some  Techniques  for  Dealing 

with  Two-level  Storage,  Computer  J.,  vol.  2,  pp. 
189-194,  1960. 

BuchW53  Buchholz,  Werner:  The  System  Design  of  the 
IBM  Type  701  Computer,  Proc.  IRE.  vol.  41,  no. 
10,  pp.  1262-1275,  October,  1953. 

BuchW57  Buchholz,  W.:  Design  Objectives  for  the  IBM 
STRETCH  Computer,  New  Computers.  Rept. 
from  the  Manufacturers  ACM  Conf.  pp.  99-104, 
1957. 

BuchW58  Buchholz,  W.:  The  Selection  of  an  Instruction 
Language,  Proc.  WJCC,  pp.  128-130,  1958. 

BuchW62  Buchholz,  Werner,  (ed.):  "Planning  a  Computer 
System,"  McGraw-Hill  Book  Company,  New 
York,  1962. 


BurdE53  Burdette, E. W.: Characteristics of the Oracle,
Argonne Natl. Lab., Proc. Symp. on Large Scale
Digital Computing Machines, pp. 194-201,
August, 1953.

BurkA62a  Burks, Arthur W., Herman H. Goldstine, and
John  von  Neumann:  Preliminary  Discussion  of 
the  Logical  Design  of  an  Electronic  Computing 
Instrument,  Part  I,  Datamation,  vol.  8,  no.  9,  pp. 
24-31,  September,  1962. 

BurkA62b  Burks, Arthur W., Herman H. Goldstine, and John
von Neumann: Preliminary Discussion of the
Logical Design of an Electronic Computing
Instrument, Part II, Datamation, vol. 8, no. 10, pp.
36-41, October, 1962.

BurkA63  Burks,  Arthur  W.,  Herman  H.  Goldstine,  and 

John  von  Neumann:  Preliminary  Discussion  of 
the  Logical  Design  of  an  Electronic  Computing 
Instrument  (Pt.  I,  vol.  1),  Rept.  prepared  for  U.S. 
Army  Ordnance  Dept.,  1946,  in  A.  H.  Taub  (ed.), 
"Collected  Works  of  John  von  Neumann,"  vol.  5, 
pp.  34-79,  The  Macmillan  Company,  New  York, 
1963. 

BussB63  Bussell,  B.,  and  G.  Estrin:  An  Evaluation  of  the 
Effectiveness  of  Parallel  Processing,  IEEE 
Pacific  Computer  Conf.  pp.  201-220,  1963. 

CampR52  Campbell,  Robert  V.  D.:  Evolution  of  Automatic 
Computing,  Proc.  ACM.  Pittsburgh  Conf.  pp. 
29-32,  May,  1952. 

CarlC63  Carlson, C. B.: The Mechanization of a
Pushdown Stack, AFIPS Proc. FJCC, vol. 24, pp.
243-250, 1963.

CarrJ56  Carr, J. W., III, and N. R. Scott (eds.): "Notes

on  the  Special  Summer  Conference  on  Digital 
Computers,"  Special  Summer  Conferences  on 
Digital  Computers,  University  of  Michigan,  Ann 
Arbor,  Mich.,  1956. 

CarrJ59  Carr, John W., III: Programming and Coding,
in Eugene M. Grabbe, Simon Ramo, and Dean E.
Wooldridge (eds.), "Handbook of Automation,
Computation, and Control," vol. 2, chap. 2, pp.
77-83, 93-98, 111-115, 115-121, John Wiley &
Sons, Inc., New York, 1959.

CartW64  Carter, W. C., H. C. Montgomery, R. J. Preiss,
and H. J. Reinheimer: Design of Serviceability
Features for the IBM System/360, IBM J. of Res.
and Dev., vol. 8, no. 2, pp. 115-125, April, 1964.




CasaC62  Casale,  Charles  T.:  Planning  the  CDC  3600, 
AFIPS  Proc.  FJCC.  vol.  22,  pp.  73-85,  1962. 

ChasG52  Chase, George C.: History of Mechanical
Computing Machinery, Proc. ACM, Pittsburgh Conf.,
pp. 1-28, May, 1952.

ChenT64  Chen, T. C.: The Overlap of the IBM System/360
Model 92 Central Processing Unit, AFIPS Proc.
FJCC, Pt. II, vol. 26, pp. 73-80, 1964. Review
GrimR65c.

ChuC52  Chu, J. C.: The Oak Ridge Automatic Computer,
Proc. ACM, Toronto Conf., pp. 142-148,
September, 1952.

ClarW57  Clark, Wesley A.: The Lincoln TX-2 Computer
Development, Proc. WJCC, pp. 143-145, 1957.

ClayB64  Clayton,  B.  B.,  E.  K.  Dorff,  and  R.  E.  Fagen:  An 

Operating  System  and  Programming  Systems 
for  the  6600,  AFIPS  Proc.  FJCC.  Pt.  II,  vol.  26, 
pp.  41-57,  1964. 

CochD68  Cochran, David S.: Internal Programming of the
9100A Calculator, Hewlett-Packard J., vol. 20, no.
1, pp. 14-16, September, 1968.

CoddE59  Codd, E. F., E. S. Lowry, E. McDonough, and
C. A. Scalzi: Multiprogramming STRETCH
Feasibility Considerations, Comm. ACM, vol. 2, no.
11, pp. 13-17, November, 1959.

CoddE62  Codd,  E.  F.:  Multiprogramming,  "Advances  in 
Computers,"  vol.  3,  pp.  78-153,  Academic 
Press,  Inc.,  New  York,  1962. 

ComfW65  Comfort,  W.  T.:  A  Computing  System  Design  for 
User  Service,  AFIPS  Proc.  FJCC,  Pt.  I,  vol.  27,  pp. 
619-626,  1965. 

ContC64  Conti, Carl: System Aspect: System/360 Model
92, AFIPS Proc. FJCC, Pt. II, vol. 26, pp. 81-95,
1964. Review GrimR65a.

ContC68  Conti, C. J., D. H. Gibson, and S. H. Pitkowsky:
Structural Aspects of the System/360 Model 85,
I. General Organization, IBM Sys. J., vol. 7, no.
1, pp. 2-14, 1968.

ConwM58  Conway, Melvin E.: Proposal for an UNCOL,
Comm. ACM, vol. 1, no. 10, pp. 5-8, October,
1958.

ConwM63  Conway,  M.  E.:  A  Multiprocessor  System  Design, 
AFIPS  Proc.  FJCC.  vol.  24,  pp.  139-146,  1963. 

CorbF62  Corbato, Fernando J., Marjorie Merwin-Daggett,
and Robert C. Daley: An Experimental
Time-sharing System, AFIPS Proc. SJCC, vol. 21, pp.
335-344, 1962.

CorbF65  Corbato,  F.  J.,  and  V.  A.  Vyssotsky:  Introduction 

and  Overview  of  the  MULTICS  System,  AFIPS 
Proc.  FJCC,  Pt.  I,  vol.  27,  pp.  185-196,  1965. 

CoxJ68  Cox,  Jerome  R.,  Jr.:  Economy  of  Scale  and 

Specialization  in  Large  Computing  Systems, 
Computer  Design,  vol.  7,  no.  11,  pp.  77-80, 
November,  1968. 

CrawP??  Crawford, P.: Thesis for Master's Degree,
Massachusetts Institute of Technology, Cambridge,
Mass.

CritA63  Critchlow,  A.  J.:  Generalized  Multiprocessing 

and  Multiprogramming  Systems,  AFIPS  Proc. 
FJCC,  vol.  24,  pp.  107-126,  1963. 

DaleR65  Daley, R. C., and P. G. Neumann: A
General-purpose File System for Secondary Storage,
AFIPS Proc. FJCC, Pt. I, vol. 27, pp. 213-229,
1965.

DaleR68  Daley, Robert C., and Jack B. Dennis: Virtual
Memory, Processes, and Sharing in MULTICS,
Comm. ACM, vol. 11, no. 5, pp. 306-312, May,
1968.

DarrJ69  Darringer, John A.: The Description, Simulation,
and Automatic Implementation of Digital
Computer Processors, Thesis for Ph.D. degree,
Carnegie-Mellon University, College of Engineering
and Science, Department of Electrical
Engineering, Pittsburgh, Pa., May, 1969.

DaviD67  Davies, D. W., K. A. Bartlett, R. A. Scantlebury,
and P. T. Wilkinson: A Digital Communication
Network for Computers Giving Rapid Response
at Remote Terminals, ACM Symp. on Operating
System Principles, Gatlinburg, Tenn., Oct. 1-4,
1967.

DaviG60  Davis, G. M.: The English Electric KDF9
Computer System, Computer Bull., pp. 119-120,
December, 1960.

DennJ65  Dennis,  J.  B.:  Segmentation  and  the  Design  of 
Multiprogrammed  Computer  Systems,  J.  ACM. 
vol.  12,  no.  4,  pp.  589-602,  October,  1965. 

DennJ66  Dennis, J., and E. C. Van Horn: Programming
Semantics for Multiprogrammed Computations,
Comm. ACM, vol. 9, no. 3, pp. 143-155, March,
1966.




DesmW64  Desmonde,  W.  H.:  "Real  Time  Data  Processing 
Systems,"  Prentice-Hall,  Inc.,  Englewood  Cliffs, 
N.J.,  1964. 

DijkE65  Dijkstra, E. W.: Solution of a Problem in
Concurrent Programming Control, Comm. ACM, vol.
8, no. 9, p. 569, September, 1965.

DreyP58  Dreyfus,  P.:  System  Design  of  the  Gamma  60, 

Proc.  WJCC.  pp.  130-133,  May,  1958. 

DunwS56  Dunwell,  S.  W.:  Design  Objectives  for  the  IBM 
STRETCH  Computer,  Proc.  EJCC,  pp.  20-22, 
1956. 

EcclW19  Eccles, W. H., and F. W. Jordan: A Trigger Relay,
Radio Rev., pp. 143-146, October, 1919.

EckeJ51  Eckert,  J.   Presper,  Jr.,  James  R.  Weiner, 

H.  Frazer  Welsh,  and  Herbert  F.  Mitchell:  The 
UNIVAC  System,  AIEE-IRE  Conf..  pp.  6-16, 
December,  1951. 

EckeJ59  Eckert, J. P., J. C. Chu, A. B. Tonik, and W. J.
Schmitt: Design of Univac-LARC System, Part
I, Proc. EJCC, pp. 59-65, 1959.

EdwaD60  Edwards, D. B. G., M. J. Lanigan, and T. Kilburn:
Ferrite-core Memory Systems with Rapid Cycle
Times, Proc. IEE, Pt. B, vol. 107, pp. 585-598,
November, 1960.

ElboR53  Elbourne,  R.  D.,  and  R.  P.  Witt:  Dynamic  Circuit 

Techniques  Used  in  SEAC  and  DYSEAC,  IRE 
Trans.,  vol.  EC-2,  no.  1,  pp.  2-9,  1953. 

ElliW51  Elliott,  W.  S.:  Circuit  Standardization  in  Series 

Working,  High-speed  Digital  Computers,  Elliott 
J.,  vol.  1,  no.  2,  p.  49,  September,  1951:  also 
in  Proc.  ACM.  March,  1950. 

ElliW52  Elliott,  W.  S.,  H.  G.  Carpenter,  and  C.  E.  Owen: 

Development  of  Computer  Components  and 
Systems,  Proc.  ACM.  Toronto  Conf.,  September, 
1952. 

ElliW53  Elliott, W. S., H. G. Carpenter, and A. St.
Johnston: The Elliott-NRDC Computer 401, A
Demonstration of Computer Engineering by Packaged
Unit Construction, Symp. ADC, pp. 273-276,
1953.

ElliW56a  Elliott, W. S., C. E. Owen, C. H. Devonald, and
B. G. Maudsley: The Design Philosophy of
Pegasus, A Quantity-production Computer, Proc.
IEE, Pt. B, vol. 103, Supp. 2, pp. 188-196, 1956.

ElliW56b  Elliott, W. S., R. C. Robbins, and D. S. Evans:
Remote Position Control and Indication by
Digital Means, Proc. IEE, Pt. B, vol. 103, Supp. 3,
pp. 437-446, 1956.

EnglW62  England, W. A.: Subminiature Computer
Designed for Space Environments, Proc. Conf. on
Spaceborne Computer Engineering, Anaheim,
Calif., pp. 95-101, October, 1962.

ErnsH63  Ernst, H. A.: TCS, An Experimental
Multiprogramming System for the IBM 7090, IBM Res.
Rept. RJ248, 41 pp., Yorktown Hts., N.Y., June,
1963.

EstrG52  Estrin, G.: A Description of the Electronic
Computer at the Institute for Advanced Studies, Proc.
ACM, Toronto Conf., pp. 95-109, September,
1952.

EstrG60  Estrin, Gerald: Organization of Computer
Systems, the Fixed Plus Variable Structure
Computer, Proc. WJCC, pp. 33-40, 1960.

EstrG63  Estrin, G., B. Bussell, R. Turn, and J. Bibb:
Parallel Processing in a Restructurable
Computer System, IEEE Trans., vol. EC-12, no. 6, pp.
747-755, December, 1963. Article reviewed by
E. G. Newman in IEEE Trans., vol. EC-13, no.
5, p. 649, October, 1964.

EverR51  Everett,  R.  R.:  The  Whirlwind  I  Computer, 

AIEE-IRE  Conf,  pp.  70-74,  1951. 

EverR57  Everett, R. R., C. A. Zraket, and H. D.
Benington: SAGE—A Data-processing System for Air
Defense, Proc. EJCC, pp. 148-155, 1957.

EwinR64  Ewing, R. G., and P. M. Davies: An Associative
Processor, AFIPS Proc. FJCC, Pt. I, vol. 26, pp.
147-158, 1964.

FaggP64  Fagg, P., J. L. Brown, J. A. Hipp, D. T. Doody,
J. W. Fairclough, and J. Greene: IBM
System/360 Engineering, AFIPS Proc. FJCC, Pt. I,
vol. 26, pp. 205-231, 1964.

FairJ56  Fairclough, J. W.: A Sonic Delay-line Storage
Unit for a Digital Computer, Proc. IEE, Pt. B, vol.
103, Supp. 3, pp. 491-496, 1956.

FalkA64  Falkoff, A. D., K. E. Iverson, and E. H.
Sussenguth: A Formal Description of System/360,
IBM Sys. J., vol. 3, no. 3, pp. 198-261, 1964.

FikeR68  Fikes, Richard E., Hugh C. Lauer, and Albin L.
Vareha, Jr.: Steps toward a General-purpose
Time-sharing System Using Large Capacity Core
Storage and TSS/360, Proc. 23rd Natl. Conf. of
ACM, Las Vegas, Nevada, pp. 7-18, August,
1968.

FlynM66  Flynn,  Michael  J.:  Very  High-speed  Computing 

Systems,  Proc.  IEEE,  vol.  54,  no.  12,  pp.  1901- 
1909,  December,  1966. 

FlynM67a  Flynn, M. J., and P. R. Low: The IBM
System/360 Model 91: Some Remarks on System
Development, IBM J. of Res. and Dev., vol. 11,
no. 1, pp. 2-7, January, 1967.

FlynM67b  Flynn, Michael J., and M. Donald MacLaren:
Microprogramming Revisited, Argonne Natl.
Lab., Appl. Math. Div., Tech. Mem. 134, pp. 1-17,
Argonne, Ill., 1967.

ForgJ65  Forgie, James W.: A Time- and Memory-sharing

Executive  Program  for  Quick  Response,  On-line 
Applications,  AFIPS  Proc.  FJCC,  Pt.  II,  vol.  27, 
pp.  127-139,  1965. 

ForrJ51  Forrester, J. W.: Digital Information Storage in
Three Dimensions Using Magnetic Cores, J.
Appl. Phys., vol. 22, pp. 44-48, January, 1951.

FothJ61  Fotheringham, John: Dynamic Storage
Allocation in the Atlas Computer, Including an
Automatic Use of a Backing Store, Comm. ACM, vol.
4, no. 10, pp. 435-436, October, 1961.

FranJ57  Frankovich, J. M., and H. P. Peterson: A
Functional Description of the Lincoln TX-2 Computer,
Proc. WJCC, vol. 19, pp. 146-155, February,
1957.

FrizC53  Frizzell,  Clarence  E.:  Engineering  Description  of 

the  IBM  Type  701  Computer,  Proc.  IRE,  vol.  41, 
no.  10,  pp.  1275-1287,  October,  1953. 

GibsC66  Gibson, C. T.: Time-sharing in the IBM
System/360: Model 67, AFIPS Proc. SJCC, vol. 28,
pp. 51-78, 1966.

GillS58  Gill,  S.:  Parallel  Programming,  Computer  J.,  vol. 

1,  no.  1,  pp.  2-10,  April,  1958. 

GlasE65  Glaser,  E.  L.,  J.  Couleur,  and  G.  Oliver:  System 

Design  of  a  Computer  for  Time  Sharing  Appli- 
cations, AFIPS  Proc.  FJCC,  Pt.  I,  vol.  27,  pp. 
197-202,  1965. 

GoldH63a  Goldstine, H. H., and John von Neumann: On the
Principles of Large Scale Computing Machines,
unpublished, 1946; in A. H. Taub (ed.), "Collected
Works of John von Neumann," vol. 5, pp. 1-32,
The Macmillan Company, New York, 1963.


GoldH63b  Goldstine, H. H., and John von Neumann:
Planning and Coding Problems for an Electronic
Computing Instrument (Pt. II, vol. 1), Rept.
prepared for U.S. Army Ordnance Dept., 1947,
in A. H. Taub (ed.), "Collected Works of John
von Neumann," vol. 5, pp. 80-151, The
Macmillan Company, New York, 1963.

GoldH63c  Goldstine, H. H., and John von Neumann:
Planning and Coding of Problems for an
Electronic Computing Instrument (Pt. II, vol. 2),
Rept. prepared for U.S. Army Ordnance Dept.,
1948, in A. H. Taub (ed.), "Collected Works of
John von Neumann," vol. 5, pp. 152-214, The
Macmillan Company, New York, 1963.

GoldH63d  Goldstine, H. H., and John von Neumann:
Planning and Coding of Problems for an
Electronic Computing Instrument (Pt. II, vol. 3),
Rept. prepared for U.S. Army Ordnance Dept.,
1948, in A. H. Taub (ed.), "Collected Works of
John von Neumann," vol. 5, pp. 215-235, The
Macmillan Company, New York, 1963.

GreeJ57  Greenstadt, J. L.: The IBM 709 Computer, New
Computers, Rept. from the Manufacturers ACM
Conf., pp. 92-98, 1957.

GreeJ64  Greene, J. E., R. F. Dean, and B. M. Updike:
Micro-programmed Implementation of the IBM
System/360 Machine Organization, IBM General
Products Div., Development Lab., Engineering
Publ., Dept. PTP792, Endicott, N.Y., April, 1964.

GreeJ66  Green,  J.:  Microprogramming,  Emulators  and 

Programming  Languages,  Comm.  ACM,  vol.  9, 
no.  3,  pp.  230-231,  March,  1966. 

GreeS52  Greenwald, Sidney: SEAC Input-Output System,
AIEE-IRE-ACM Conf., pp. 31-36, December,
1952.

GreeS53  Greenwald, Sidney, R. C. Haueter, and S. N.
Alexander: SEAC, Proc. IRE, vol. 41, no. 10, pp.
1300-1313, October, 1953.

GregJ63  Gregory, J., and R. McReynolds: The SOLOMON
Computer, IEEE Trans., vol. EC-12, no. 6, pp.
774-781, December, 1963.

GrimR65a  Grimsdale, R. L.: A Review of ContC64,
Computing Rev., vol. 6, no. 6, p. 430, November,
December, 1965.

GrimR65b  Grimsdale, R. L.: A Review of AmdaG64c,
Computing Rev., vol. 6, no. 6, p. 429, November,
December, 1965.




GrimR65c  Grimsdale, R. L.: A Review of ChenT64,
Computing Rev., vol. 6, no. 6, pp. 429-430,
November, December, 1965.

GrosH53  Grosch, H. R. J.: High Speed Arithmetic: The
Digital Computer as a Research Tool, J. Optical
Society of America, vol. 4, no. 4, pp. 306-310,
April, 1953.

GrueF68  Gruenberger, F. J.: The History of the
JOHNNIAC, Mem. RM-565A-PR, prepared for United
States Air Force Project Rand, The Rand
Corporation, Santa Monica, Calif., October, 1968.

GrumM58  Grumette, Murray: IBM 704—Code Nundrums,
Comm. ACM, vol. 1, no. 3, pp. 3-13, March,
1958.

HainL65  Haines, L. H.: Serial Compilation and the 1401
FORTRAN Compiler, IBM Sys. J., vol. 4, no. 1,
pp. 73-80, January, 1965.

HaleA62  Haley, A. C. D.: The KDF9 Computer System,
AFIPS Proc. FJCC, vol. 22, pp. 108-120, 1962.

HambC62  Hamblin, C. L.: Translation to and from Polish
Notation, Computer J., vol. 5, pp. 210-213,
October, 1962.

HaneF68  Haney, Frederick M.: Using a Computer to
Design Computer Instruction Sets, Thesis for Ph.D.
degree, Carnegie-Mellon University, College of
Engineering and Science, Department of
Computer Science, Pittsburgh, Pa., May, 1968.

HartD68  Hartley, D. F., B. Landy, and R. M. Needham:
The Structure of a Multiprogramming
Supervisor, Computer J., vol. 11, no. 3, pp. 247-255,
November, 1968.

HaucE68  Hauck, E. A., and B. A. Dent: Burroughs B
6500/B 7500 Stack Mechanism, AFIPS Proc.
SJCC, vol. 32, pp. 245-251, 1968.

HaueR52  Haueter, R. C.: Auxiliary Equipment to SEAC
Input-Output, AIEE-IRE-ACM Conference, pp.
39-44, December, 1952.

HellH61  Hellerman, H.: On the Organization of a
Multiprogramming—Multiprocessing System, IBM
Res. Rept. RC-522, 52 pp., Yorktown Hts., N.Y.,
September, 1961.

HellH66  Hellerman, H.: Parallel Processing of Algebraic
Expressions, IEEE Trans., vol. EC-15, no. 1, pp.
82-91, February, 1966.

HerwP60  Herwitz, Paul S., and James H. Pomerene: The
Harvest System, Proc. WJCC, pp. 23-32, 1960.

HillJ66  Hillegass, John R.: Auerbach on Equipment IBM
System 360—The First Two Years, Data
Processing Mag., vol. 8, no. 5, pp. 44-51, May, 1966.

HodgD64  Hodges, Donald: IPL-VC, A Proposal for a
Computer System Having the IPL-V Instruction Set,
Argonne Natl. Lab., Appl. Math. Div., Tech. Mem.
66, 22 pp., January, 1964.

HollJ59  Holland, John: A Universal Computer Capable
of Executing an Arbitrary Number of
Subprograms Simultaneously, Proc. EJCC, pp. 108-113,
1959.

HopkA63  Hopkins, A. L., R. L. Alonso, and H. Blair-Smith:
Logical Description of the Apollo Guidance
Computer (AGC 4), M.I.T. Instrumentation Lab.,
Rept. R-393 (confidential), Cambridge, Mass.,
March, 1963.

HowaD61  Howarth, D. J., R. B. Payne, and F. H. Sumner:
The Manchester University Atlas Operating
System, Part II: User's Description, Computer J.,
vol. 4, no. 3, pp. 226-229, October, 1961.

HowaD62  Howarth, D. J., P. D. Jones, and M. T. Wyld: The
ATLAS Scheduling System, Computer J., vol. 5,
no. 3, pp. 238-244, October, 1962.

HowaD63  Howarth, D. J.: Experience with the Atlas
Scheduling System, AFIPS Proc. SJCC, vol. 23,
pp. 59-67, 1963.

HughE54  Hughes, E. S., Jr.: The IBM Magnetic Drum
Calculator Type 650, Engineering and Design
Considerations, Proc. WJCC, pp. 140-154, 1954.

IverK62  Iverson, Kenneth E.: A Common Language for
Hardware, Software, and Applications, AFIPS
Proc. FJCC, vol. 22, pp. 121-129, 1962.

JohnD52  Johnston, D. L.: Standardized Printed Circuit
Units for Digital Computers, Proc. ACM,
Pittsburgh Conf., pp. 135-141, May, 1952.

KampT60  Kampe, Thomas W.: The Design of a
General-purpose Microprogram-controlled Computer
with Elementary Structure, IRE Trans., vol. EC-9,
no. 2, pp. 208-213, June, 1960.

KatzJ66  Katz, J. H.: Simulation of a Multiprocessor
Computing System, AFIPS Proc. SJCC, vol. 28,
pp. 127-139, 1966.

KilbT56  Kilburn, T., D. B. G. Edwards, and C. E. Thomas:
The Manchester University Mark II Digital
Computing Machine, Proc. IEE, Pt. B, vol. 103, Supp.
2, pp. 247-268, 1956.




KilbT60a  Kilburn, T., and R. L. Grimsdale: A Digital
Computer Store with a Very Short Read Time, Proc.
IEE, Pt. B, vol. 107, pp. 567-572, November,
1960.

KilbT60b  Kilburn, T., D. B. G. Edwards, and D. Aspinall:
A Parallel Arithmetic Unit Using a Saturated
Transistor Fast-Carry Circuit, Proc. IEE, Pt. B,
vol. 107, pp. 573-584, November, 1960.

KilbT61a  Kilburn, T., D. J. Howarth, R. B. Payne, and F. H.
Sumner: The Manchester University Atlas
Operating System, Part I: Internal Organization,
Computer J., vol. 4, pp. 222-225, October, 1961.

KilbT61b  Kilburn, T., R. B. Payne, and D. J. Howarth: The
Atlas Supervisor, AFIPS Proc. EJCC, vol. 20, pp.
279-294, 1961.

KilbT62  Kilburn, T., D. B. G. Edwards, M. J. Lanigan,
and F. H. Sumner: One-level Storage System,
IRE Trans., vol. EC-11, no. 2, pp. 223-235, April,
1962.

KinsH64  Kinslow, H. A.: The Time-sharing Monitor
System, AFIPS Proc. FJCC, Pt. I, vol. 26, pp. 443-454,
1964.

KistJ57  Kister,  J.,  P.  Stein,  S.  Ulam,  W.  Walden,  and  M. 

Wells:  Experiments  in  Chess,  /.  ACM.  vol.  4,  no. 
2,  pp.  174-177,  April,  1957. 

KitoA56  Kitov,  A.    I.:   Elektronnie  Tsifrovie  Mashiny 

(Electronic  Digital  Machines),  Izdatelstvo 
Sovetskoe  Radio,  Moscow,  partial  translation 
available,  1956. 

KleiR53  Klein, R. J., Jr.: The Oracle Memory System,
Argonne Natl. Lab., Proc. Symp. on Large Scale
Digital Computing Machines, pp. 47-58, August,
1953.

KnigK66  Knight, Kenneth E.: Changes in Computer Performance,
Datamation, vol. 12, no. 9, pp. 40-54,
September, 1966.

KnigK68  Knight, Kenneth E.: Evolving Computer Performance
1963-1967, Datamation, vol. 14, no. 1, pp.
31-35, January, 1968.

KnutD66  Knuth, D. E.: Additional Comments on a Problem
in Concurrent Programming Control,
Comm. ACM, vol. 9, no. 5, pp. 321-322, 1966.

KrogM61  Kroger, Marlin G., et al.: Computers in Command
and Control, TR61-12, prepared for
DOD:ARPA by Digital Computer Application
Study, Institute for Defense Analyses, Research
and Engineering Support Division, November,
1961.

KuckD68  Kuck, D. J.: ILLIAC IV Software and Application
Programming, IEEE Trans., vol. C-17, no. 8, pp.
758-770, August, 1968.

LampB65  Lampson, B. W.: Interactive Machine Language
Programming, AFIPS Proc. FJCC, Pt. I, vol. 27,
pp. 473-481, 1965.

LampB66  Lampson, B. W., W. W. Lichtenberger, and
M. W. Pirtle: A User Machine in a Time-sharing
System, Proc. IEEE, vol. 54, no. 12, pp. 1766-1774,
December, 1966.

LangJ67  Langdon, J. L., and E. J. Van Derveer: Design
of a High-speed Transistor for the ASLT Current
Switch, IBM J. of Res. and Dev., vol. 11, no. 1,
pp. 69-73, January, 1967.

LaueH67  Lauer, Hugh C.: Bulk Core in a 360/67 Time-sharing
System, AFIPS Proc. FJCC, vol. 31, pp.
601-609, 1967.

LebeS56  Lebedev, S. A.: The High-speed Calculating Machine
of the Academy of Sciences of the USSR,
J. ACM, vol. 3, pp. 129-133, 1956.

LehmM63a  Lehman, M., R. Eshed, and Z. Netter: SABRAC,
A Time-sharing Low-cost Computer, Comm.
ACM, vol. 6, no. 8, pp. 427-429, August, 1963.

LehmM63b  Lehman, M., R. Eshed, and Z. Netter: SABRAC
—A  New  Generation  Serial  Computer,  IEEE 
Trans., vol. EC-12, no. 6, pp. 618-628, December,
1963.

LehmM65  Lehman, M.: Serial Mode Operation and High-speed
Parallel Processing, Proc. IFIP Cong. 1965,
Pt. 2, pp. 631-633, 1965.

LehmM66  Lehman, M.: A Survey of Problems and Preliminary
Results Concerning Parallel Processing
and Parallel Processors, Proc. IEEE, vol. 54, no.
12, pp. 1889-1901, December, 1966.

LeinA54  Leiner, A. L., and S. N. Alexander: System
Organization of the DYSEAC, Professional Group
on Electronic Computers, Institute of Radio Engineers,
vol. EC-3, no. 1, pp. 1-10, March, 1954.

LeinA57  Leiner, A. L., W. A. Notz, J. L. Smith, and A.
Weinberger: Organizing a Network of Computers
to Meet Deadlines, Proc. EJCC, pp. 115-128,
1957.

LeinA58  Leiner, A. L., W. A. Notz, J. L. Smith, and A.
Weinberger: PILOT, The NBS Multicomputer
System, Proc. EJCC, pp. 71-75, 1958.

LeinA59  Leiner, A. L., W. A. Notz, J. L. Smith, and A.
Weinberger: PILOT, A New Multiple Computer
System, J. ACM, vol. 6, no. 3, pp. 313-335,
1959.

LichW65  Lichtenberger, W., and M. W. Pirtle: A Facility
for Experimentation in Man-Machine Interaction,
AFIPS Proc. FJCC, Pt. I, vol. 27, pp.
589-598, 1965.

LindA66  Lindquist, A. B., R. R. Seeber, and L. W. Comeau:
A Time-sharing System Using an Associative
Memory, Proc. IEEE, vol. 54, no. 12, pp.
1774-1779, December, 1966.

LiptJ68  Liptay, J. S.: Structural Aspects of the System/360
Model 85, II. The Cache, IBM Sys. J.,
vol. 7, no. 1, pp. 15-21, 1968.

LloyR67  Lloyd, R. H. F.: ASLT: An Extension of Hybrid
Miniaturization Techniques, IBM J. of Res. and
Dev., vol. 11, no. 1, pp. 86-92, January, 1967.

LoneW61  Lonergan, William, and Paul King: Design of the
B 5000 System, Datamation, vol. 7, no. 5, pp.
28-32, May, 1961.

LonsK56  Lonsdale, K., and E. T. Warburton: Mercury: A
High Speed Digital Computer, Proc. IEE, Pt. B,
vol. 103, Supp. 2, pp. 174-183, 1956.

LourN59  Lourie, N., H. Schrimpf, R. Reach, and W. Kahn:
Arithmetic and Control Techniques in a Multiprogram
Computer, Proc. EJCC, pp. 75-81,
1959.

McCaJ62  McCarthy, J.: "Time Sharing Computer Systems
in Management and the Computer of the Future,"
The M.I.T. Press, Cambridge, Mass.,
1962.

McCaJ63  McCarthy, J., S. Boilen, E. Fredkin, and J. C.
R. Licklider: A Time-sharing Debugging System
for a Small Computer, AFIPS Proc. SJCC, vol. 23,
pp. 51-57, 1963.

McCoB63  McCormick, Bruce H.: The Illinois Pattern Recognition
Computer-ILLIAC III, IEEE Trans., vol.
EC-12, no. 5, pp. 791-813, December, 1963.

McCuJ65  McCullough, J. D., K. H. Speierman, and F. W.
Zurcher: Design for a Multiple User Multiprocessing
System, AFIPS Proc. FJCC, Pt. I, vol. 27,
pp. 611-617, 1965.

McPhJ51  McPherson, J. L., and S. N. Alexander: Performance
of the Census Univac System, AIEE-IRE
Conf., pp. 16-22, December, 1951.

MaheR61  Maher, R. J.: Problems of Storage Allocation in
a Multiprocessor Multiprogrammed System,
Comm. ACM, vol. 4, no. 10, pp. 421-422, October,
1961.

MarcM63  Marcotty, M. J., F. M. Longstaff, and A. P. M.
Williams: Time-sharing on the Ferranti-Packard
FP6000 Computer System, AFIPS Proc. SJCC,
vol. 23, pp. 29-40, 1963.

MeadR63  Meade, R. M.: 604 Machine Description, IBM
Internal Man., 38 pp., December, 1963.

MeagR51  Meagher, R. E., and J. P. Nash: The Ordvac,
AIEE-IRE Conf., pp. 37-43, December, 1951.

MelbA65  Melbourne,  A.  J.,  and  J.  M.  Pugmire:  A  Small 

Computer  for  the  Direct  Processing  of  FORTRAN 
Statements,  Computer  J.,  vol.  8,  pp.  24-27,  April, 
1965. 

MendM66  Mendelson,  M.  J.,  and  A.  W.  England:  The  SDS 
SIGMA  7:  A  Real-time,  Time-sharing  Computer, 
AFIPS  Proc.  FJCC,  vol.  29,  pp.  51-64,  1966. 

MercR57  Mercer, Robert J.: Micro-programming, J. ACM,
vol. 4, no. 2, pp. 157-171, 1957.

MerrI56  Merry, I. W., and B. G. Maudsley: The Magnetic-drum
Store of the Computer Pegasus, Proc. IEE,
Pt. B, vol. 103, Supp. 2, pp. 197-202, 1956.

MetrN52  Metropolis, N., E. F. Klein, W. Orvedahl, J. R.
Richardson, H. B. Demuth, and J. B. Jackson:
MANIAC, Proc. ACM, Toronto Conf., pp. 13-17,
September, 1952.

MillW63  Miller, W. F., and R. A. Aschenbrenner: The GUS
Multicomputer System, IEEE Trans., vol. EC-12,
no. 6, pp. 671-676, December, 1963.

MiraW67  Miranker, W. L., and W. M. Liniger: Parallel
Methods for the Numerical Integration of Ordinary
Differential Equations, Math. of Computation,
vol. 21, no. 99, pp. 303-320, July, 1967.

MolnC67  Molnar, Charles E., Severo M. Ornstein, and
Antharvedi Anne: The CHASM: A Macromodular
Computer for Analyzing Neuron Models, AFIPS
Proc. SJCC, vol. 30, pp. 393-401, 1967.

MonnR68  Monnier,  Richard  E.:  A  New  Electronic  Calcula- 
tor with  Computerlike  Capabilities,  Hewlett- 
Packard  J.,  vol.  20,  no.  1,  pp.  3-9,  September, 
1968. 




MorrD67  Morris, Derrick, Frank H. Sumner, and Michael
T. Wyld: An Appraisal of the Atlas Supervisor,
Proc. ACM Natl. Meeting, pp. 67-75, 1967.

MuntC62  Muntz, C. A.: A List Processing Interpreter for
AGC4, M.I.T., Instrumentation Lab., AGC Mem.
2, Cambridge, Mass., January, 1962.

MurtJ66  Murtha, J. C.: Highly Parallel Information Processing
Systems, in "Advances in Computers,"
vol. 7, pp. 2-116, Academic Press, Inc., New
York, 1966.

MyerT68  Myer, T. H., and I. E. Sutherland: On the Design
of Display Processors, Comm. ACM, vol. 11, no.
6, pp. 410-414, June, 1968.

NeweA56  Newell, A., and H. A. Simon: The Logic Theory
Machine, IRE Trans., vol. IT-2, no. 3, pp. 61-79,
September, 1956.

NeweA57a  Newell, A., and J. C. Shaw: Programming the
Logic Theory Machine, Proc. WJCC, pp. 230-240,
February, 1957.

NeweA57b  Newell, A., J. C. Shaw, and H. A. Simon: Empirical
Explorations of the Logic Theory Machine,
Proc. WJCC, pp. 218-230, February, 1957.

NeweA58  Newell, A., J. C. Shaw, and H. A. Simon: The
Elements of a Theory of Human Problem Solving,
Psychology Rev., vol. 65, pp. 151-166,
March, 1958.

NievJ64  Nievergelt,  J.:  Parallel  Methods  for  Integrating 

Ordinary  Differential  Equations,  Comm.  ACM, 
vol.  7,  no.  12,  pp.  731-733,  December,  1964. 

NiseN66  Nisenoff,  N.:  Hardware  for  Information  Process- 

ing Systems:  Today  and  in  the  Future,  Proc. 
IEEE,  vol.  54,  no.  12,  pp.  1820-1835,  Decem- 
ber, 1966. 

OsboT68  Osborne,  Thomas  E.:  Hardware  Design  of  the 
Model  9100A  Calculator,  Hewlett-Packard  J.,  vol. 
20,  no.  1,  pp.  10-13,  September,  1968. 

OssaJ65  Ossanna, J. F., L. E. Mikus, and S. D. Dunten:
Communications and Input-Output Switching in
a Multiplex Computing System, AFIPS Proc.
FJCC, Pt. I, vol. 27, pp. 231-241, 1965.

PadeA64  Padegs, A.: Channel Design Considerations,
IBM Sys. J., vol. 3, no. 2, pp. 165-180, 1964.

PadeA68  Padegs, A.: Structural Aspects of the System/360
Model 85, III. Extensions to Floating-point
Architecture, IBM Sys. J., vol. 7, no.
1, pp. 22-29, 1968.


PapiW57  Papian,  W.  N.:  High-speed  Computer  Stores  2.5 
Megabits,  Electronics,  vol.  30,  no.  10,  pp.  162- 
167,  October,  1957. 

PatzW67  Patzer, William J., and Gilbert C. Vandling: Systems
Implications of Microprogramming, Computer
Design, vol. 6, no. 12, pp. 62-66, December,
1967.

PeacA??  Peacock, A.: Read-only Memory and Computer
Control, to be published.¹

PennJ62  Penny,  J.  P.,  and  T.  Pearcey:  Use  of  Multipro- 
gramming in  the  Design  of  a  Low  Cost  Digital 
Computer,  Comm.  ACM,  vol.  5,  no.  9,  pp.  473- 
476,  September,  1962. 

PikeJ52  Pike, James L.: Input-Output Devices Used with
SEAC, AIEE-IRE-ACM Conf., pp. 36-38, December,
1952.

PlugW61  Plugge,  W.  R.,  and  M.  N.  Perry:  American  Air- 

lines' "SABRE"  Electronic  Reservations  System, 
Proc.  WJCC,  pp.  593-602,  May,  1961. 

PortR60  Porter,  R.  E.:  The  RW-400-A  New  Polymorphic 

Data  System,  Datamation,  vol.  6,  no.  1,  pp. 
8-14,  January/February,  1960. 

RajcJ43  Rajchman,  J.,  Snyder,  and  Rudnick:  RCA  Labo- 

ratories Report,  under  terms  of  OSRD  contract 
OEM-sr-591, 

RandB68  Randell, B., and C. J. Kuehner: Dynamic Storage
Allocation Systems, Comm. ACM, vol. 11, no. 5,
pp. 297-306, May, 1968.

RichR55  Richards, R. K.: "Arithmetic Operations in Digital
Computers," D. Van Nostrand Company, Inc.,
Princeton, N.J., 1955.

RobeJ58  Robertson,  J.  E.:  A  New  Class  of  Digital  Division 

Methods,  IRE  Trans.,  vol.  EC-7,  no.  3,  pp.  218- 
222,  September,  1958. 

RobeL67  Roberts,  Lawrence  G.:  Multiple  Computer  Net- 
works and  Intercomputer  Communication,  ACM 
Symp.  on  Operating  System  Principles,  Gatlinburg, 
Tenn.,  Oct.  1-4,  1967. 

RoseG67  Rose,  Gordon  A.:  "Intergraphic,"  A  Micropro- 
grammed Graphical-Interface  Computer,  IEEE 
Trans.,  vol.  EC-16,  no.  6,  pp.  773-784,  Decem- 
ber, 1967. 

¹According to E. F. Codd, this article has not been published as of Jan. 23,
1968. However, "Microprogram Control for System/360" by S. G. Tucker, IBM
Sys. J., vol. 6, no. 4, 1967, has been, and it covers the material that we
think was intended to be in PeacA??.




RoseJ65  Rosenfeld, J.: Marbles and Boxes, IBM Res.
Project Rept., Yorktown Hts., N.Y., November,
1965.

RoseS67  Rosen, Saul: "Programming Systems and Languages,"
McGraw-Hill Book Company, New
York, 1967.

RoseS69  Rosen, Saul: Electronic Computers: A Historical
Survey, Computing Surveys, vol. 1, no. 1, pp.
7-36, March, 1969.

RosiR69  Rosin, Robert F.: Contemporary Concepts of
Microprogramming and Emulation, Computing
Surveys, vol. 1, no. 4, pp. 197-212, December,
1969.

RossH53  Ross, Harold D., Jr.: The Arithmetic Element of
the IBM Type 701 Computer, Proc. IRE, vol. 41,
no. 10, pp. 1287-1294, October, 1953.

RothS59  Rothman, S.: R/W 40 Data Processing System,
Intern. Conf. on Information Processing and
Auto-math 1959, Ramo-Wooldridge, Div. of
Thompson Ramo Wooldridge, Inc., Los Angeles,
Calif., June, 1959.

SaltJ66  Saltzer, J. H.: Traffic Control in a Multiplexed
Computer System, M.I.T. Tech. Rept. MAC-TR-30,
July, 1966.

SamuA57  Samuel, Arthur L.: Computers with European
Accents, Proc. WJCC, pp. 14-17, 1957.

SaxoJ63  Saxon,  J.  A.:  "Programming  the  IBM  7090," 

Prentice-Hall,  Inc.,  Englewood  Cliffs,  N.J.,  1963. 

SchlH??  Schlaeppi, H. P.: Extensions of PL/I like Languages
for Parallel Processing, with Programming
Examples, in preparation.

SchwJ64  Schwartz, J. I.: A General-purpose Time sharing
System, AFIPS Proc. SJCC, vol. 25, pp. 397-411,
1964.

SechR67  Sechler, R. F., A. R. Strube, and J. R. Turnbull:
ASLT Circuit Design, IBM J. of Res. and Dev., vol.
11, no. 1, pp. 74-85, January, 1967.

SeebR63  Seeber, R. R., and A. B. Lindquist: Associative
Logic for Highly Parallel Systems, AFIPS Proc.
FJCC, vol. 24, pp. 489-493, 1963.

SegaR61  Segal, R. J., and H. P. Guerber: Four Advanced
Computers—Key to Air Force Digital Data Communication
System, AFIPS Proc. EJCC, vol. 20,
pp. 264-278, 1961.

SenzD65  Senzig, D. N., and R. V. Smith: Computer Organization
for Array Processing, AFIPS Proc. FJCC,
Pt. I, vol. 27, pp. 117-128, 1965.

SerrR62  Serrell, R., M. M. Astrahan, G. W. Patterson, and
I. B. Pyne: The Evolution of Computing Machines
and Systems, Proc. IRE, vol. 50, no. 5,
pp. 1039-1058, May, 1962.

ShanC38  Shannon, C. E.: A Symbolic Analysis of Relay
and Switching Circuits, Trans. AIEE, vol. 57, pp.
713-723, 1938.

SharW69  Sharpe, William F.: "The Economics of Computers,"
Columbia University Press, New York,
1969.

ShawJ58  Shaw, J. C., A. Newell, H. A. Simon, and T. O.
Ellis: A Command Structure for Complex Information
Processing, Proc. WJCC, pp. 119-128,
1958.

ShedG66a  Shedler, G. S., and M. Lehman: Parallel Computation
and the Solution of Polynomial Equations,
IBM Res. Rept. 1550, Yorktown Hts., N.Y.,
February, 1966.

ShedG66b  Shedler, G. S.: Parallel Numerical Methods for
the Solution of Equations, IBM Res. Rept. RC
1619, Yorktown Hts., N.Y., June, 1966.

ShupP53  Shupe, P. D., and R. A. Kirsch: SEAC, Review
of Three Years of Operation, Proc. EJCC, pp.
83-90, 1953.

SlotD62  Slotnick, Daniel L., W. Carl Borck, and Robert
C. McReynolds: The SOLOMON Computer,
AFIPS Proc. FJCC, vol. 22, pp. 97-107, 1962.

SlutR51  Slutz, Ralph J.: Engineering Experience with the
SEAC, AIEE-IRE Conf., pp. 90-94, December,
1951.

SmitR64  Smith,  R.  V.,  and  D.  N.  Senzig:  Computer  Orga- 

nization for  Array  Processing,  IBM  Res.  Rept.  RC 
1330,  Yorktown  Hts.,  N.Y.,  December,  1964. 

SoloM66  Solomon, Martin B., Jr.: Economies of Scale and
the IBM System/360, Comm. ACM, vol. 9, no.
6, pp. 435-440, June, 1966.

SquiJ63  Squire, J. S., and S. M. Polais: Programming
and Design Considerations of a Highly Parallel
Computer, AFIPS Proc. SJCC, vol. 23, pp. 395-400,
1963.

SteeT61  Steel, T. B., Jr.: A First Version of UNCOL, Proc.
WJCC, pp. 371-377, 1961.

StevL52  Stevens, L. D.: Engineering Organization of
Input and Output for the IBM 701 Electronic
Data-processing Machine, AIEE-IRE-ACM
Conf., pp. 81-85, December, 1952.

StevW64  Stevens, W. Y.: The Structure of System/360,
Part II—System Implementations, IBM Sys. J.,
vol. 3, no. 2, pp. 136-143, 1964.

StraC59  Strachey, C.: Time Sharing in Large Fast Computers,
Proc. ICIP, UNESCO, pp. 336-341, June,
1959.

SumnF62  Sumner,  F.  H.,  G.  Haley,  and  E.  C.  Y.  Chen:  The 
Central  Control  Unit  of  the  "Atlas"  Computer, 
Proc.  IFIP  Cong.  1962,  pp.  657-662,  1962. 

TaylN51  Taylor, Norman H.: Evaluation of the Engineering
Aspects of Whirlwind I, AIEE-IRE Conf., pp.
75-78, December, 1951.

TeagH65  Teager, Herbert M.: A Review of AmdaG64a,
Computing Rev., vol. 6, no. 5, pp. 355-356,
September-October, 1965.

ThomR63  Thompson, R. N., and J. A. Wilkinson: The D825
Automatic Operating and Scheduling Program,
AFIPS Proc. SJCC, vol. 23, pp. 41-49, 1963.

ThorJ64  Thornton, James E.: Parallel Operation in the
Control Data 6600, AFIPS Proc. FJCC, Pt. II, vol.
26, pp. 33-40, 1964.

TomaR67  Tomasulo, R. M.: An Efficient Algorithm for
Exploiting Multiple Arithmetic Units, IBM J. of
Res. and Dev., vol. 11, no. 1, pp. 25-33, January,
1967.

TuckS67  Tucker, S. G.: Microprogram Control for System/360,
IBM Sys. J., vol. 6, no. 4, pp. 222-241,
1967.

TuriS59  Turing,  Sara:  "Alan  M.  Turing,"  W.  Heffer  and 

Sons,  Ltd.,  Cambridge,  England,  1959. 

UngeS58  Unger,  S.  H.:  A  Computer  Oriented  toward  Spa- 
tial Problems,  Proc.  IRE,  vol.  46,  no.  10,  pp. 
1744-1750,  October,  1958. 

VandW52  Van  der  Poel,  W.  L.:  A  Simple  Electronic  Digital 
Computer,  Appl.  Sci.  Res.,  Sec.  B,  vol.  2,  pp. 
367-400,  1952. 

VandW56  Van  der  Poel,  W.  L.:  The  Logical  Principles  of 
Some  Simple  Computers,  Thesis,  Amsterdam, 
1956. 

VandW59  Van  der  Poel,  W.  L.:  ZEBRA,  A  Simple  Binary 
Computer,  Proc.  ICIP,  UNESCO,  pp.  361-365, 
June,  1959. 


VyssV65  Vyssotsky,  V.  A.,  F.  J.  Corbato,  and  R.  M.  Gra- 

ham: Structure  of  the  Multics  Supervisor,  AFIPS 
Proc.  FJCC,  Pt.  I,  vol.  27,  pp.  203-212,  1965. 

WaleE62  Walendziewicz,  E.  T.:  The  D210  Magnetic  Com- 

puter, Proc.  Conf.  on  Spaceborne  Computer  Engi- 
neering, Anaheim,  Calif.,  pp.  117-127,  Oct. 
30-31,  1962. 

WareW63a  Ware,  W.  H.:  "Digital  Computer  Technology  and 
Design,"  vol.  1,  "Mathematical  Topics,  Princi- 
ples of  Operation,  and  Programming,"  John 
Wiley  &  Sons,  Inc.,  New  York,  1963. 

WareW63b  Ware,  W.  H.:  "Digital  Computer  Technology  and 
Design,"  vol.  2,  "Circuits  and  Machine  Design," 
John  Wiley  &  Sons,  Inc.,  New  York,  1963. 

WebeH67  Weber,  Helmut:  A  Microprogrammed  Implemen- 
tation of  EULER  on  IBM  System/360  Model  30, 
Comm.  ACM,  vol.  10,  no.  9,  pp.  549-558,  Sep- 
tember, 1967. 

WeikM55  Weik, M. H.: A Survey of Domestic Electronic
Digital Computing Systems, Ballistic Research
Laboratories, Aberdeen, Md., Rept. 971, December,
1955.

WeikM61  Weik, Martin H.: A Third Survey of Domestic
Electronic Digital Computing Systems, Ballistic
Research Laboratories, Aberdeen, Md.; report
supersedes BRL Rept. 1010, Department of the
Army Project No. 5B03-06-002 (1961).

WeikM64  Weik, Martin H., Jr.: A Fourth Survey of Domestic
Electronic Digital Computer Systems,
Ballistic Research Laboratories, Aberdeen, Md.,
Rept. 1227; processed by Defense Documentation
Agency, Defense Supply Agency No. 42900,
January, 1964.

WestG60  West, George P., and Ralph J. Koerner: Communications
within a Polymorphic Intellectronic
System, Proc. WJCC, pp. 225-230, 1960.

WilkJ53  Wilkinson,  J.  H.:  "The  Pilot  ACE,"  pp.  5-14, 

Automatic  Digital  Computation,  National  Physi- 
cal Laboratory,  Teddington,  England,  March 
25-28,  1953. 

WilkM51a  Wilkes, M. V.: The Best Way to Design An Automatic
Calculating Machine, Manchester University
Computer Inaugural Conf., July, 1951. Published
by Ferranti Ltd., London.

WilkM51b  Wilkes, M. V.: The Edsac Computer, AIEE-IRE
Conf., pp. 79-83, December, 1951.




WilkM52  Wilkes, M. V., D. J. Wheeler, and S. Gill: "The
Preparation of Programs for a Digital Computer,"
Addison-Wesley Publishing Company, Inc.,
Reading, Mass., 1952.

WilkM53  Wilkes, M. V., and J. B. Stringer: Micro-programming
and the Design of the Control
Circuits in an Electronic Digital Computer, Proc.
Cambridge Phil. Soc., Pt. 2, vol. 49, pp. 230-238,
April, 1953.

WilkM58a  Wilkes, M. V., W. Renwick, and D. J. Wheeler:
The Design of the Control Unit of an Electronic
Digital Computer, Proc. IEE, Pt. B, vol. 105, pp.
121-128, March, 1958.

WilkM58b  Wilkes, M. V.: Microprogramming, Proc. EJCC,
pp. 18-20, 1958.

WilkM65  Wilkes, M. V.: Slave Memories and Dynamic
Storage Allocation, IEEE Trans., vol. EC-14, no.
2, pp. 270-271, 1965.

WilkM69  Wilkes, M. V.: The Growth of Interest in Microprogramming:
A Literature Survey, Computing
Surveys, vol. 1, no. 3, pp. 139-145, September,
1969.

WillC53  Williams, Charles R.: A Review of ORDVAC
Operating Experience, Proc. EJCC, pp. 91-95,
1953.

WillF49  Williams, F. C., and T. Kilburn: A Storage System
for Use with Binary-Digital Computing Machines,
Proc. IEE, Pt. 3, vol. 96, pp. 81-100,
March, 1949. Same paper in Pt. 2, vol. 96, pp.
183-202, April, 1949.

WirsJ66  Wirsching, Joseph E.: NOVA: A List-oriented
Computer, Datamation, vol. 12, no. 12, pp.
41-43, December, 1966.

WirtN66a  Wirth, N., and H. Weber: EULER: A Generalization
of ALGOL, and Its Formal Definition: Part
I, Comm. ACM, vol. 9, no. 1, pp. 13-25, January,
1966.

WirtN66b  Wirth, N., and H. Weber: EULER: A Generalization
of ALGOL, and Its Formal Definition: Part
II, Comm. ACM, vol. 9, no. 2, pp. 89-99, February,
1966.

WirtN66c  Wirth, N.: A Note on "Program Structures" for
Parallel Processing, Comm. ACM, vol. 9, no. 5,
pp. 320-321, May, 1966.

ZadeL63  Zadeh,  Lotfi  A.,  and  Charles  A.  Desoer:  "Linear 

System  Theory,"  McGraw-Hill  Book  Company, 
New  York,  1963. 


Name Index


Adams, Charles W., 42, 585
Adams Associates, 42, 257, 580
Ainsworth, Ernest, 212
Alexander, S. N., 165, 212
Allard, R. W., 496
Allen, M. W., 469
Allmark, R. H., 257, 262-266
Alonso, R. L., 146-156
Amdahl, Gene M., 259, 469, 561
Anderson, D. W., 587
Anderson, James P., 257, 34S, 447-455, 469, 586
Anderson, S. F., 587
Anne, Antharvedi, 73
Arbuckle, R. A., .5()
Arbuckle, T., 349
Arden, B. W., 81, 275, 469, 566, 571
Aschenbrenner, R. A., 469
Aspinall, D., 277
Astrahan, M. M., 42, 119, 144, 212, 223, 515


Babbage, Charles, 46
Backus, John, 9
Baldwin, F. R., 46
Baldwin, R. R., 469
Barnes, George H., 320-333
Bartlett, K. A., 504
Barton, R. S., 257, 273
Bashkow, Theodore R., 363-381
Basilewskii, hi. la., 213
Beckman, F. S., 146
Belsky, M. A., 349
Benington, H. D., 504
Bernstein, A., 349
Bhushan, A., 507
Bibb, J., 469
Blaauw, G. A., 259, 426, 428, 464, 561, 588-601
Blair-Smith, H., 146-156
Bloch, Erich, 421-439
Blosk, R. T., 439
Bock, R. V., 257
Boilen, S., 291
Boland, L. J., 587
Borck, W. Carl, 320, 463
Bouchon, Falcon, Jacques, 46
Boutwell, E., Jr., 334
Bowden, B. V., 42
Bright, H. S., 291, 456
Brooker, R. A., 279
Brooks, F. P., Jr., 146, 259, 349, 423, 428, 464, 561, 588-601
Brown, J. L., 385
Brown, Richard M., 320-333
Buchholz, Werner, 396, 421, 428, 469, 515
Burdette, E. W., 119
Burks, Arthur W., 86-119
Bussell, B., 469


Campbell, Robert V. D., 42
Carlson, C. B., 257, 273
Carpenter, H. G., 171
Carr, J. W., III, 205-215, 220-224
Carter, W. C., 587
Casale, Charles T., 69, 155, 156, 396
Chase, George C., 42
Chen, E. C. Y., 274
Chen, T. C., 587
Chu, J. C., 119, 396
Clark, Wesley A., 274
Clayton, B. B., 496
Cochran, David S., 243-256, 439
Codd, E. F., 397, 439, 469
Comeau, L. W., 587
Comfort, W. T., 291, 469
Conti, Carl J., 563, 574
Conway, Melvin E., 295, 457
Corbato, Fernando J., 295, 457, 469, 517, 523, 571
Couleur, J., 469
Cox, Jerome R., Jr., 50
Crawford, P., 111
Cray, Seymour, 471
Critchlow, A. J., 469
Culler, Glen, 45


Daley, Robert C., 275, 297, 469, 517, 523, 571
Darringer, John A., 13
Davies, D. W., 504
Davies, P. M., 469
Davis, G. M., 257
Dean, R. F., 340, 587
Demuth, H. B., 119
Dennis, Jack B., 81, 275, 295, 457, 469
Dent, B. A., 257
Desmonde, W. H., 456
Desoer, Charles A., 7
Devonald, C. H., 171-183
Dijkstra, E. W., 469
Doody, D. T., 385
Dorff, E. K., 496
Dreyfus, P., 456
Dunten, S. D., 469
Dunwell, S. W., 421


Earle, J. G., 587
Eccles, W. H., 46
Eckert, J. Presper, Jr., 91, 157-169, 396
Edwards, D. B. G., 276-290
Elbourne, R. D., 172, 212
Elliott, W. S., 171-183
Ellis, T. O., 257, 349-362
England, A. W., 396
England, W. A., 149
Ernst, H. A., 469
Eshed, R., 469
Estrin, Gerald, 119, 469
Evans, D. S., 171
Everett, R. R., 137-145, 504
Ewing, R. G., 469


Fagen, R. E., 496
Fagg, P., 385
Fairclough, J. W., 171, 174, 176, 385
Falkoff, A. D., 13, 458, 587
Fikes, Richard E., 571
Flynn, Michael J., 83, 340, 587
Forgie, James W., 291, 469
Forrester, J. W., 75
Fotheringham, John, 190
Frankovich, J. M., 469
Fredkin, E., 291
Fried, 45
Frizzell, Clarence E., 525


Galler, B. A., 81, 275, 469, 566, 571
Gibson, C. T., 81, 587
Gibson, D. H., 574
Gibson, W. B., 469
Gill, S., 456
Glaser, E. L., 469
Goldschmidt, R. E., 587
Goldstine, Herman H., 87-119
Grabbe, E. M., 205-215, 220-224
Graham, R. M., 469
Granito, G. D., 587
Green, A., 156
Green, J., 392
Greene, J., 340, 587
Greenstadt, J. L., 525
Greenwald, Sidney, 212
Gregory, J. G., 315, 463
Grimsdale, R. L., 277, 587
Grosch, H. R. J., 585
Gruenberger, F. J., 89, 119
Grumette, Murray, 525
Guerber, H. P., 509


Haines, L. H., 392
Haley, A. C. D., 266
Haley, G., 274

Hamblin, C. L., 257
Haney, Frederick M., 9
Hartley, D. F., 290
Hauck, E. A., 257
Haueter, R. C., 212
Hayata, Tomo, 344
Hellerman, H., 469
Herwitz, Paul S., 397
Hillegass, John R., 587
Hipp, J. A., 385
Hodges, Donald, 257
Hoffman, Samuel A., 257, 447-455, 469
Holland, John, 315, 320
Hollerith, H., 46
Hopkins, A. L., 146-156, 349
Hoskinson, E. A., 334
Howarth, D. J., 274
Hughes, E. S., Jr., 223
Huskey, H. D., 191, 193


Iverson,  Kenneth  E.,  13,  587 


Jackson, J. B., 119
Jacquard, Joseph Marie, 46
Johnston, A. St., 171
Johnston, D. L., 171
Jones, P. D., 290
Jordan, F. W., 46


Kahn, W., 469
Kampe, Thomas W., 71, 334, 341-347
Kato, Maso, 320-333
Katz, J. H., 463
Kepler, Johannes, 46
Kilburn, T., 75, 274-290
King, Paul, 257, 267-273
Kinslow, H. A., 469
Kirsch, R. A., 212
Kister, J., 349
Kitov, A. I., 213
Klein, E. F., 119
Klein, R. J., Jr., 119
Knight, Kenneth E., 50-51
Knuth, D. E., 469
Koerner, Ralph J., 485
Kroger, Marlin G., 448
Kronfeld, Arnold, 363-381
Kuck, David J., 320-333
Kuehner, C. J., 77, 274


Lampson, B. W., 291-300
Landy, B., 290
Langdon, J. L., 581
Lanigan, M. J., 276-290
Laning, J. H., Jr., 146-156
Lauer, Hugh C., 571
Lawless, W. J., Jr., 146
Lebedev, S. A., 213
Lehman, M., 393, 446, 456-469
Leibniz, Gottfried Wilhelm, 46
Leiner, A. L., 212, 440-445, 449, 456
Lichtenberger, W. W., 291-300
Licklider, J. C. R., 291
Lindquist, A. B., 469, 587
Liniger, W. M., 463
Liptay, J. S., 587
Lloyd, R. H. F., 587
Lonergan, William, 257, 267-273
Longstaff, F. M., 469
Lonsdale, K., 279
Lourie, N., 469
Low, P. R., 587
Lowry, E. S., 397
Lucking, J. R., 257, 262-266
Lukasiewicz, J., 270


McCarthy, J., 291, 469
McCormick, Bruce H., 315
McCullough, J. D., 291, 456, 469
McDonough, E., 397
MacLaren, Donald, 587
McPherson, J. L., 165, 169
McReynolds, Robert C., 315, 320, 463
Maher, R. J., 273
Marcotte, A. U., 587
Marcotty, M. J., 469
Mauchly, John W., 91
Maudsley, B. G., 171-183
Mauer, H., 156
Meade, R. M., 469
Meagher, R. E., 119
Melbourne, A. J., 392
Mendelson, M. J., 396
Mercer, Robert J., 340
Merry, I. W., 171, 176
Merwin-Daggett, Marjorie, 469, 517, 523, 57
Messina, B. U., 587
Metropolis, N., 119
Mikus, L. E., 469
Miller, W. F., 469
Miranker, W. L., 463
Mitchell, Herbert F., 157-169
Molnar, Charles E., 73
Monnier, Richard E., 243-256
Montgomery, H. C., 587
Morris, Derrick, 274
Mueller, 46
Muntz, C. A., 155, 156
Murtha, J. C., 320
Myer, T. H., 303


Nash, J. P., 119
Naur, Peter, 9
Needham, R. M., 290
Netter, Z., 469
Neumann, P. G., 297, 349
Newell, A., 257, 349-362
Nievergelt, J., 463
Nisenoff, N., 42
Notz, W. A., 440-445, 449, 456


O'Brien, T. G., 81, 275, 469, 566, 571
Oleksiak, R., 156
Oliver, G., 469
Ornstein, Severo M., 73
Orvedahl, W., 119
Osborne, Thomas E., 243-256
Ossanna, J. F., 469
Owen, C. E., 171-183


Padegs,  A.,  587 
Papian,  W.  N.,  279 
Fames,  David  L.,  13 
Pascal,  Blaise.  46 

Patterson,  G.  W.,  42.  119,  144,  212,  223 

Patzer,  William  J.,  .340 

Pa\'ne,  R.  B.,  274 

Peacock,  A.,  604 

Pearcey,  T,  469 

Penny,' J.  P.,  469 

Perry-,  M.  N.,  .5f)4 

Peterson,  H.  P.,  469 

Pike,  James  L.,  212 

Pirtle,  M.  W.,  291-300 

Pitkowsk-y,  S.  H.,  574 

Plugge,  \V.  R.,  .504 

Polais,  S.  M.,  469 

Poland,  C.  B.,  469 

Pomerene,  James  H.,  397 

Porter,  R.  E.,  449,  477-488 

Powers,  D.  M.,  587 

Preiss,  R.  J.,  587 

Pugmire,  J.  M.,  392

Pyne,  I.  B.,  42,  119,  144,  212,  223


Rajchman,  J.,  111

Ramo,  S.,  205-215,  220-224

Randell,  B.,  77,  274 

Reach,  R.,  469 

Reinheimer,  H.  J.,  587 

Renwick,  W.,  346

Richards,  R.  K.,  146,  150

Richardson,  J.  R.,  119

Robbins,  R.  C.,  171

Roberts,  Lawrence  G.,  45,  504 

Roberts,  M.  De  V.,  349

Robertson,  J.  E.,  431

Rochester,  N.,  515 

Rose,  Gordon  A.,  304,  469 

Rosen,  Saul,  3,  42 

Rosenfeld,  J.,  468 

Rosin,  Robert  F.,  340,  649 




Ross,  Harold  D.,  Jr.,  525 
Rothman,  S.,  470,  485 
Rudnick,  111,  119 


Saltzer,  J.  H.,  295 

Samuel,  Arthur  L.,  42,  119,  144,  257 

Sanderson,  J.  C.,  469

Sasson,  Azra,  363-381

Saxon,  J.  A.,  525

Scalzi,  C.  A.,  397

Scantlebury,  R.  A.,  504 

Schickhardt,  Wilhelm,  46 

Schlaeppi,  H.  P.,  457,  463

Schmitt,  W.  J.,  396

Schrimpf,  H.,  469

Schwartz,  J.  I.,  291

Scott,  N.  R.,  209 

Sechler,  R.  F.,  587

Seeber,  R.  R.,  469,  587 

Segal,  R.  J.,  506

Senzig,  D.  N.,  463,  469 

Serrell,  R.,  42,  119,  144,  212,  223 

Shannon,  E.  C.,  46,  649

Sharpe,  William  F.,  585

Shaw,  J.  C.,  257,  349-362

Shedler,  G.  S.,  463

Shifman,  Joseph,  257,  447-455,  469

Shupe,  P.  D.,  212

Simon,  H.  A.,  257,  349-362

Slotnick,  Daniel  L.,  315,  320-333,  463

Slutz,  Ralph  J.,  210 

Smith,  J.  L.,  440-445,  449,  456

Smith,  J.  W.,  587

Smith,  R.  V.,  463,  469

Snyder,  111,  119

Solomon,  Martin  B.,  Jr.,  561

Sparacio,  F.  J.,  587 

Speierman,  K.  H.,  291,  456,  469 

Squire,  J.  S.,  469 

Steel,  T.  B.,  Jr.,  8 


Stein,  P.,  .349 

Stevens,  L.  D.,  525 

Stevens,  W.  Y.,  563,  587,  602-606 

Stokes,  Richard  A.,  320-333

Stotz,  R.  H.,  507

Strachey,  C.,  469

Stringer,  J.  B.,  200,  335-340,  344

Strube,  A.  R.,  587 

Sumner,  Frank  H.,  274-290 

Sussenguth,  E.  H.,  13,  587 

Sutherland,  I.  E.,  303


Taub,  A.  H.,  92 
Taylor,  Norman  H.,  144 
Teager,  Herbert  M.,  587 
Thomas,  C.  E.,  279 
Thomas,  L.  X.,  46 
Thompson,  R.  N.,  455 
Thornton,  James  E.,  489-503 
Tomasulo,  R.,  587
Tonik,  A.  B.,  396
Tucker,  S.  G.,  340
Turing,  Alan  M.,  23,  191,  193 
Turing,  Sara,  191,  199 
Turn,  R.,  469 
Turnbull,  J.  R.,  587


Ulam,  S.,  349 
Unger,  S.  H.,  320
Updike,  B.  M.,  340,  587


Van  der  Poel,  W.  L.,  200-204 
Van  Derveer,  E.  J.,  587
Vandling,  Gilbert  C.,  340
Van  Horn,  E.  C.,  295,  457 
Vareha,  Albin  L.,  Jr.,  571


von  Neumann,  John,  86-119 
Vyssotsky,  V.  A.,  295,  457,  469 


Walden,  W.,  349
Walendziewicz,  E.  T.,  148,  156
Warburton,  E.  T.,  279
Ward,  J.  E.,  507
Ware,  W.  H.,  650

Weber,  Helmut,  257,  340,  348,  382-392,  469,
587

Weik,  Martin  H.,  Jr.,  42 

Weinberger,  A.,  440-445,  449,  456 

Weiner,  James  R.,  157-169

Wells,  M.,  349 

Welsh,  H.  Frazer,  157-169 

West,  George  P.,  485 

Westervelt,  F.  H.,  81,  275,  469,  566,  571

Wheeler,  D.  J.,  346 

Wilkes,  M.  V.,  84,  139,  200,  214,  334-340,
344,  345,  396,  574
Wilkinson,  J.  A.,  455
Wilkinson,  J.  H.,  193-199
Wilkinson,  P.  T.,  504
Williams,  A.  P.  M.,  469
Williams,  Charles  R.,  119
Williams,  F.  C.,  75

Williams,  Robert  J.,  257,  447-455,  469
Wirsching,  Joseph  E.,  316-319
Wirth,  N.,  257,  348,  383,  389,  392,  469
Witt,  R.  P.,  172,  212
Wolf,  K.  A.,  496

Wooldridge,  D.  E.,  205-215,  220-224
Wright,  M.  V.,  349
Wyld,  Michael  T.,  274


Zadeh,  Lotfi  A.,  7 
Zemlin,  R.  A.,  496 
Zraket,  C.  A.,  504 
Zurcher,  F.  W.,  291,  456,  469 


Machine  and  Organization  Index 


Page  references  in  boldface  refer  to  the  Appendix,  ISP  descriptions,  and  PMS  diagrams. 


Aberdeen  Proving  Grounds  (see  EDVAC; 

ENIAC;  IAS)
ACE  (NPL/National  Physical  Laboratory),  39, 
43,  44,  74,  190,  193-199,  216 
introduction,  193 
ISP,  193-199 
PMS,  191,  193,  198
T(io),  197-199 
ADU/Accumulation  and  Distribution  Unit
(see  ComLogNet)
AEC/Atomic  Energy  Commission,  396
AGC/Apollo  Guidance  Computer  (M.I.T.
Instrumentation  Laboratory),  44,  89,
146-156
D(arithmetic),  150-152 
design  and  construction,  148 
interpreter,  147-148 
introduction,  146 
ISP,  152-155
PMS,  146-148 
Air  Force,  137 

ALGOL  language,  13,  45,  73,  257,  267,  348 
ALPAK  language,  45 
ALWAC  IIIE,  II,  44 
AMBIT  language,  45 
AN/FSQ-27  (see  RW-40  and  400)
AN/GYK-3(V)  (see  D825  and  D830)
AN/UYK  (RW  =>  TRW),  71
AOSP/Automatic  Operating  and  Scheduling
program  (see  D825,  operating  system)
APEXC,  39

APL/A  Programming  Language,  13,  45 
Apollo  (see  AGC) 
Argonne  Laboratory,  257 
Arithmometer  (L.  X.  Thomas),  46 
ARPA/Advanced  Research  Projects  Agency,
291-300,  315
network,  510-512 
Arrow  (see  Strela) 
ASI  6000  (EMR),  44 

Atlas  (Manchester  University,  Ferranti),  43-45, 
82,  91,  274-290 

input-output,  274-283,  285-289 

interrupt,  274,  276-277

introduction,  276 

ISP,  276-279,  283-285 

M(core),  280-283,  289-290

multiprogramming,  274-283 

operating  system,  279,  285-287 

PMS,  277,  279-283,  289-290 

RT,  287-289 
ATLAS-1  and  2  (Ferranti),  43 
AVIDAC,  39,  89


B  160,  170,  180,  250,  260,  263,  270,  273,  280,
283,  and  300  (Burroughs),  43-44
B  2500,  2501,  and  3500  (Burroughs),  43
B  5000  (Burroughs),  43,  44,  79,  81,  257-261,
267-273
design,  267
ISP,  268-273
operating  system,  267-268
PMS,  258-260,  268
B  5500  (Burroughs),  43-45
B  6500  and  B  7500  (Burroughs),  43,  45,
257-261,  325,  328
B  8500  and  B  8501  (Burroughs),  43-44,  64,  257
Babbage's  Analytic  Engine,  42,  46,  53
Babbage's  Difference  Engine,  46
Baldwin  Calculator,  46
BASIC  (Dartmouth  College),  45,  236
Bell  System,  303
Bell  Telephone  Laboratory  computers,  39,
42-43,  45-46
Bendix  =>  CDC  (see  under  CDC;  G-15;  G-20)
BESK,  39,  89
BESM,  213 

BINAC  (Eckert-Mauchly),  43,  91,  163 
BIT  480  (Business  Information  Technology), 
44 

Bitran  6  (Fabri-Tek),  44 
BIZMAC  I,  II  (RCA),  39,  43
BTL  MACRO  language,  45
BTSS/Berkeley  Time  Sharing  System 

(University  of  California,  Berkeley),  44, 
45,  274-275,  291-300 

input-output,  297-300
introduction,  291-292
ISP,  291-297
M(files),  297-300
multiprogramming,  291-295
operating  system,  292-300
PMS,  275,  292
T(io),  297
Burroughs  (see  B  2500;  B  5000;  B  5500;
B  6500;  B  8500;  D825;  Datatron  204,  205,
and  220;  E  101,  102,  and  103;  ILLIAC
IV)


California,  University  of,  Berkeley  (see  BTSS) 
Carnegie-Mellon  University,  120,  571
CDC/Control  Data  Corporation  (see  G-15; 
G-20) 

CDC  160,  A,  G,  43,  44,  120 
CDC  924,  3100,  3200,  3300,  and  3500,  43-44, 
79 


CDC  1604,  44,  89 
CDC  1700,  44 

CDC  3400,  3600,  and  3800,  43-44,  348,  396
CDC  6400,  6416,  6500,  6600,  6700,  and  7600,
43-45,  47,  71,  76,  79,  83,  120,  170,  397, 
470-476,  489-503 

circuits,  494-495 

history,  470,  489 

ISP,  472,  491-493,  497-503 

operating  system,  472-475 

packaging,  494-496

performance,  470-471 

PMS,  470,  471-475,  476,  489-494 

RT,  491-494 
CDC  8090  and  8092  (see  CDC  160,  A,  G) 
CDP/Communications  Data  Processor  (see 

ComLogNet) 
C.E.C.E.,  39 

Census,  Bureau  of,  157,  164-165
CG24  (Lincoln  Laboratory),  43 
Chasm  special  purpose  computer,  73 
COBOL  60  and  61  language,  45 
Columbia  University  Calculator,  46 
COMIT  language,  33,  45
ComLogNet,  45,  509-510
CORC  language,  45 

CPC/Card  Programmed  Calculator  (IBM),  43, 
88 

CSIRAC,  89 

Culler-Fried  on  line  language,  45


D825  and  D830  (Burroughs),  44,  45,  257-260,
446-455
design  philosophy,  447-450
input-output,  454-455
ISP,  453
operating  system,  450-455
PMS,  260,  450-455
DASK,  89
Datamatic  1000  (Honeywell),  39,  43
DATANET  30  (GE),  43
Datatron  204,  205,  and  220  (Burroughs),  39,
43,  44 

DDP-19  (Honeywell),  43 

DDP-24,  224,  and  124  (Honeywell),  43-44 

DDP-116,  316,  416,  and  516  (Honeywell),
43-44,  512
DEC/Digital  Equipment  Corporation  (see
PDP-1)

DEC  338,  260,  303-314,  396
interpreter,  305
introduction,  305 




ISP,  305-309,  310-314

PMS,  121 

(See  also  PDP-8) 
Deuce  (English  Electric),  39,  43-45,  191 

(See  also  ACE) 
DMI/Data  Machine  Inc.  =>  Varian  Associates, 
44 

DMI  520/1  (Varian),  44 

DMI  620  (Varian),  44 

Dutch  Postal  and  Telecommunications 

Services,  200 
Dynamo  language,  45 

DYSEAC  (National  Bureau  of  Standards),  39, 
43,  172,  440 

E  101,  102,  and  103  (Burroughs),  43,  44 
EAI/Electronic  Associates  Inc.,  44 
EAI  640,  44 

Eccles-Jordan  Flip-Flop,  46 
Eckert-Mauchly  Computer 

Corporation  =>  UNIVAC,  91
EDSAC  I  and  II  (Cambridge  University),  39,
42-45,  58,  89,  139,  144,  196,  398
EDVAC/Electronic  Discrete  Variable
Automatic  Computer  (University  of
Pennsylvania),  39,  42-45,  95
Eight-bit  character  computer,  170,  184-187,
224
introduction,  184
ISP,  184,  185,  186-187
EMR  6130,  44

English  Electric  =>  ICT/International
Computers  and  Tabulators  (see  KDF  9)
ENIAC/Electronic  Numerical  Integrator  and
Computer  (University  of  Pennsylvania),
39,  42-43,  45-47,  88,  113
ERA/Engineering  Research
Associates  =>  UNIVAC,  43,  192
(See  also  UNIVAC  1101,  1102;  UNIVAC
1103A)
ERMA  (see  GE  100)
ESS/Electronic  Switching  System  (Bell
System),  303
EULER,  44,  73,  257,  348,  382-392
interpreter  (microprogram),  385-392
introduction,  382-383
ISP,  383-385,  388-391
PMS,  382-392


Fabri-Tek  (see  Bitran  6) 
FACT  language,  45

Ferranti  Corp.  Ltd.  =>  ICT/International
Computers  and  Tabulators,  39
(See  also  Atlas;  Mercury;  Pegasus)

FLAG  (Florida),  39

FOCAL  (DEC)  language,  236

FORMAC  (IBM)  language,  45

FORTRAN  (IBM),  FORTRAN  II,  FORTRAN
IV  language,  45,  50,  73,  348


FORTRAN  Machine,  44,  348,  363-381

interpreter,  366-379

introduction,  363-364

ISP,  363-365

logical  design,  365-381

PMS,  365-366

RT,  364-368,  375-381
FX-1  (Lincoln  Laboratory),  43-45 


G-15  (Bendix  =>  CDC),  39,  43-44,  74,  191
G-20  (Bendix  =>  CDC),  44,  57,  152
Gamma  60  (Machines  Bull),  44,  456
GARDE  312  (GE),  43
GE  100/ERMA,  43
GE  115,  43

GE  205,  210,  215,  225,  235,  255,  and  265,
43-44
GE  412,  435,  43-44
GE  635,  625,  43

GE  645  (General  Electric),  43,  45,  79,  275
GE  4040,  4050,  4060,  4020,  and  4050  II,  43
General  Automation  (see  SPC-8)
General  Precision  =>  CDC  (see  LGP-30)
Genie  project  (see  BTSS)

George  (University  of  New  South  Wales),  257
Gott  Sei  Danke,  346
GPS  language,  45


H-200  series:  110,  120,  125,  200,  400,  1200,
1250,  2200,  3200,  4200,  and  8200
(Honeywell),  43,  44,  58,  225
H-1400  and  1800  (Honeywell),  43
Harvard  (see  Marks)
Hollerith  Punched  Cards,  46
Honeywell  (see  Datamatic  1000;  DDP-19;
DDP-24;  DDP-116;  H-200;  H-1400)
Host  computer  (see  ARPA  network)
HP/Hewlett-Packard  (see  HP  9100A)
HP  9100A,  44,  235-236,  243-256

D,  243-244,  254-256

ISP,  243-249

microprogram,  254-256

packaging,  250,  252-253

PMS,  235,  249-254

RT,  250

T,  243,  248,  253 


IAS/Institute  for  Advanced  Studies  machine
(see  von  Neumann)
IBM  ASP/Attached-Support  Processor,  506 
IBM  305  (disk),  43,  45 
IBM  650,  39,  43,  44,  91,  216,  220-223 

ISP,  220-223 
IBM  701,  39,  43-45,  47,  89,  515-516

PMS,  515

(See  also  IBM  7094) 


IBM  702,  39,  43,  47,  87 

IBM  705,  705  III,  708,  and  7080,  39,  43-44,
47,  87,  433
IBM  1130  (see  IBM  1800)
IBM  1401,  1440,  and  1460,  43-45,  47,  61,  188,
224-234,  562-564

history,  225 

interpreter,  229 

introduction,  225-226 

ISP,  226-229,  231-234 

PMS,  226 

RT,  229-230
IBM  1410  and  7010,  43,  44
IBM  1620,  III,  and  1710,  43-44,  225
IBM  1800  and  1130,  43-45,  48,  55,  90,  396,
399-420,  470,  575-576,  579,  583-586

input-output,  405,  409-411

interpreter,  408-409

introduction,  399-400

ISP,  407-416,  417-420

PMS,  400-405,  404

RT,  405-409,  411-413
IBM  2938,  45,  72 
IBM  7030  (see  Stretch) 
IBM  7070,  7072,  7074,  43,  44 
IBM  7094  I,  II,  7044,  7040,  7090,  709,  and
704,  30-32,  39,  43-45,  47,  54,  64,  70,  79,
91,  149,  303,  396,  422,  433,  515-541,
562-564

history,  515-517

interpreter,  522-523

ISP,  523,  526-541

multiprogramming,  523

P(io),  524-525

PMS,  517-519

RT,  520-522
IBM  Multiplying  Calculator,  46
IBM  Stretch  (see  Stretch)
IBM  System/360,  43-45,  61,  64,  303,  396

addressing,  565-566,  594

array  processor,  576-579 

base  register,  594 

(See  also  addressing  above) 

bibliography,  587 

branch  instructions,  595

channel-to-channel  adapter,  576 

circuits,  564,  603-604 

cost,  579-585 

critique  by  authors,  561-587 
data  types,  564-565,  590-594
design,  561-564,  588
direct  control,  597

(See  also  input-output  below)
emulation,  562-563
floating  point,  591-592
functional  schematic,  589
general  registers,  564-565
history,  561

(See  also  design  above)
information  formats  (see  data  types  above)
innovations,  562 




input-output,  588,  598-601

[See  also  P(io;  data  channels)  below;  PMS
and  PMS  diagrams  below]
interpreter,  594-595,  604-605 
interrupts,  596-597 
introduction,  561,  588 
ISP,  564-566,  588-601 
logical  structure,  588-601 

(See  also  ISP  above) 
M(content  addressable),  571,  573-574 
M(Large  capacity  store),  571-572,  582-583 
M(read  only),  604-605 

(See  also  microprogramming  below;
Models  30,  40,  and  50  below)
microprogramming,  563-564,  604-605

(See  also  Models  30,  40,  and  50  below)
model  range,  561-564,  588,  602-606 

(See  also  performance  below) 
Model  20,  563-567 
Model  25,  184,  563,  567,  569 
Model  30,  236,  348,  382-392,  566-568, 
602-606 

ISP,  385-388 

microprogram,  382-385,  388-392 
RT,  386 

Model  67,  76,  79,  275,  561,  563,  571, 

573-574 
Model  75,  561,  563,  571 
Model  85,  76,  561,  563,  574-575 
Model  91,  561,  563,  575 
Models  30,  40,  and  50,  561,  563,  566,  568,

602-603 

Mp,  563,  571-572,  582-583,  602-603 
multiprocessing,  456-469,  585-587 
multiprogramming,  565-566,  571,  573-574, 

597-598 
networks,  576-579,  581,  598 

(See  also  IBM  ASP) 
performance,  563,  579-587,  602-606 
P(io;  data  channels),  573-574,  576-577, 

598-601,  605-606 
PMS  and  PMS  diagrams,  563,  566-579, 
602-606 

K(special  controls),  576 

Model  20,  567 

Model  44,  569-571,  569 

Model  67,  571,  573-574,  573 

Model  75,  567,  571-572 

Model  85,  574-575,  575 

Model  91,  575 

Models  30,  40,  and  50,  65,  566-568, 
566-567 

Ms  (data  cell,  disk,  drum),  577,  579 
Ms(magnetic  tape),  578-579,  578
P(array),  576-578,  576 
P(special),  576-578,  576 
S(c),  579,  581 

(See  also  networks  above;  IBM  ASP) 
T(analog),  581 
T(audio),  579 
T(display),  579 


T(print,  punch,  read),  580
T(telephone,  typewriter),  579,  581 
processor  state,  564-565,  588,  596-598 
RT,  568,  570,  572,  603-604 
S(cross-point;  time-multiplexed;  BCU),  573 
SLT/Solid  Logic  Technology,  564,  603-604
storage  protection  (see  multiprogramming
above) 

storage-to-storage  channel,  576-577 
SVC/Supervisor  Call,  597 
system  implementations,  602-606 
timer,  597 

variable-length  character  strings,  591 
ASCII,  593 
decimal,  593-594 
EBCDIC,  592 
ICT/International  Computers  and  Tabulators,
91,  274 
(See  also  Atlas;  KDF  9) 
ILLIAC  I  (University  of  Illinois),  39,  43-45, 
89 

ILLIAC  II  (University  of  Illinois),  43 
ILLIAC  III  (University  of  Illinois),  43,  315
ILLIAC  IV  (University  of  Illinois),  43-45,  47, 
66,  72,  315,  320-330 

input-output,  322,  327-328 

interpreter,  322-325 

introduction,  320-321 

ISP,  322-325,  330-333 

PMS,  321-322,  327-329 
K(P),  322-323 

RT,  326 
Illinois,  University  of,  43 

(See  also  under  ILLIAC) 
IMP  computer  (see  ARPA,  network) 
Instrumentation  Laboratory,  M.I.T.  (see  AGC)
Interdata,  Model  3  and  4,  44,  184
IPL  I,  II,  III,  IV,  and  V,  45,  257
IPL  VI/Information  Processing  Language,  44,
45,  73,  257,  348-362 

design,  349-350 

interpreter,  351,  354-355,  359-362 
ISP,  354-359,  361-362 
RT,  352-354 
IPL  VC,  257 


Jacquard  Punched  Card  Loom,  46 
JOHNNIAC  (RAND),  43-44,  78,  89 
JOSS  (RAND)  language,  45,  78
JOVIAL  (SDC)  language,  45 


KDF  9  (English  Electric),  44,  257-266 
D,  263-266 
introduction,  262 
ISP,  262-263 
PMS,  260 
RT,  264 


LARC  (UNIVAC),  43-44,  86,  396-397 
Lehman  Computer  example  (IBM  Research), 
44-45,  446,  456-469 

application,  464-469 

design  philosophy,  456-457 

instructions,  457-461 

interrupt,  458-461

introduction,  456 

operating  system,  461-463 

performance,  456-457,  463-469 

PMS,  459-461 

simulation,  463-469
Leibniz  Calculator,  46
LEO  I  and  II,  39

Leprechaun  (Bell  Telephone  Laboratories),  43
LGP-30,  and  LGP-21  (General
Precision  =>  CDC),  44,  45,  74,  91,  192,
216-219
ISP,  217,  218-219
PMS,  217

LINC/Laboratory  Instrument  Computer
(M.I.T.  Lincoln  Laboratory),  43,  44,  120
LINC-8  (DEC)  (see  PDP-8)
Lincoln  Laboratory  (M.I.T.),  571

(See  also  CG24;  FX-1;  LINC;  MTC;  TX-0,
TX-2)

LISP  1.0  and  1.5  language,  45
Lockheed  Electronics  (see  MAC-16)
Los  Alamos  (see  AEC)
LRL/Lawrence  Radiation  Laboratory,
Livermore,  California,  396-397
LRL  network,  507 


MAC-16  (Lockheed  Electronics),  44 

MAD  language  (University  of  Michigan),  45 

MADM/Manchester  Automatic  Digital 

Machine,  39,  58 
Manchester  University,  39,  45,  340

(See  also  Atlas;  MADM;  Mark  I;  Muse)
MANIAC  I  and  II  (University  of  California,
Los  Alamos),  39,  43,  89
Mark  I  (Manchester  University),  43
Mark  I,  II,  III,  and  IV  (Harvard),  39,  42-43,
46

Mathmatic  language,  45 
MEG,  39 

Mercury  (Ferranti),  39,  279 

Michigan,  University  of,  MAD,  MIDAC,  192,
209-212,  571
MIDAC  (Michigan,  University  of),  39,  44,  192, 

209-212 
ISP,  209-212 
MILSMAC,  347 
MISTIC,  43 

M.I.T.  CTSS  operating  system,  45
M.I.T./Massachusetts  Institute  of  Technology 

(see  AGC;  GE  645;  Lincoln  Laboratory; 

MULTICS  project;  Whirlwind  I) 
M.I.T.  network,  507 
Monorobot,  Monorobot  XI,  39,  44




Monroe  Calculator,  46

Monroe  Corporation,  46

Moore  School  of  Electrical  Engineering  (see
Pennsylvania,  University  of)
MOSAIC,  39
Motorola  1000,  44-45

MTC/Memory  Test  Computer  (M.I.T.  Lincoln
Laboratory),  39,  45,  89
Mueller's  Difference  Engine,  46
MULTICS  project  (M.I.T.),  45,  571
Muse  (Manchester  University),  43,  277 


NBS/National  Bureau  of  Standards  (see 

DYSEAC;  PILOT;  SEAC) 
Neher  Laboratory,  200 
Network  of  Computers,  504-512,  505-512 

ARPA,  510-512 

ComLogNet,  509-510

IBM  ASP,  506

LRL,  507 

M.I.T.,  507 

SABRE,  504 

SAGE,  504 

Texas,  University  of,  506-507 
typical,  508-509 
NORC,  39,  44 

NOVA  (LRL/Lawrence  Radiation 
Laboratory),  44,  66,  315-319 
applications,  316-317
introduction,  316 
ISP,  317-318 
RT,  318 

NPL/National  Physical  Laboratory,  45 
(See  also  ACE) 


Olivetti-Underwood  (see  Programma  101  Desk 

Calculator) 
ONR,  Office  of  Naval  Research,  137 
ORACLE,  89 

ORDVAC  (University  of  Illinois),  39,  43,  89


Pascal  Calculator,  46 

PB/Packard  Bell  =>  Raytheon  (see  PB-250;
PB-440)
PB-250,  44,  74,  191
PB-440,  334
PDC  808,  816,  44
PDP-1  (DEC),  44-45
PDP-4,  7,  9,  and  15,  43-45
PDP-8,  8S,  8I,  8L,  and  5,  20-32,  43-44,  49,  90,
120-136,  396

applications,  120

circuits,  132-133

input-output,  123

interpreter,  131

interrupt,  123

ISP,  22-a3,  120-123,  127,  134-136
logical  design,  127-133
M(core),  128-129

PMS,  20-21,  123-131,  121,  124,  126,  128

RT,  125,  127-131

(See  also  DEC  338)
PDP-10  and  6,  43-45,  79,  170,  275,  564
PDP-12,  LINC-8  (see  PDP-8)
Pegasus  (Ferranti),  44,  62,  170-183,  564 

circuits,  171-174,  176 

introduction,  181 

ISP,  176-179,  182-183

logical  design,  172-175,  179-181

packaging,  174-176,  179-182 
Pennsylvania,  University  of  (Moore  School), 

43,  46,  95 

(See  also  EDVAC;  ENIAC)
Philco  212,  44

PILOT  (National  Bureau  of  Standards),  39,  43,
44,  75,  397-398,  440-445,  449
applications,  440
input-output,  444-445

ISP,  442-444
performance,  440-442
PMS,  398,  440-442
Polymorphic  (RW)  (see  RW-40  and  400)
Programma  101  Desk  Calculator
(Olivetti-Underwood),  44,  216,  235,
237-242
ISP,  237-242
PMS,  237-238,  237
Programmed  Console  (Washington  University),
120

PUFFT  compiler,  45

RAND  Corporation  (see  JOHNNIAC;  JOSS)
RAYDAC  (Raytheon),  39
RCA/Radio  Corporation  of  America  (see
BIZMAC  I,  II;  SPECTRA  70  Series)
RCA  110,  43,  44
RCA  301  and  3301,  43
RCA  501  and  601,  43,  44,  225
RCA  1600,  184
RCA  Spectra  70,  561-562
Recomp  I,  II,  and  III,  44
Rice  University  computer,  45,  53
RW/Ramo  Wooldridge  (see  AN/UYK)
RW-40  and  400  (Thompson,  Ramo,
Wooldridge),  44,  53,  192,  400,  470-471,
477-488

design  philosophy,  477

interrupt,  481-482

ISP,  470,  480-482

ISP  language,  486-488

PMS,  471,  477-480,  482-485


SABRE  network  (American  Airlines),  45,  504
SAGE/Semi-Automatic  Ground  Environment
network,  45,  504
SCC/Scientific  Control  Corp.  650,  120
Schickhardt  Calculator,  46


SD-2  (Librascope),  44,  334,  341-347

design,  341-343

interpreter,  550-552

introduction,  341

ISP,  343-347

microprogram,  345-346

packaging,  341-343

PMS,  343

RT,  343-345
SDC/Systems  Development  Corp.,  45
SDS/Scientific  Data  Systems  =>  XDS/Xerox
Data  Systems  (see  SDS  910;  SDS  940  and
945;  Sigma  2  and  3;  Sigma  5  and  7)
SDS  92,  44,  120

SDS  910,  920,  925,  930,  9300,  43,  44,  91,  291,
542-560
history,  542-543
input-output,  543-545,  552-555
interpreter,  551-552
interrupt,  553-555
introduction,  542-543
ISP,  544-545,  548-550,  556-560
PMS,  275,  543,  546-548,  546
RT,  550-552
(See  also  BTSS)
SDS  940  and  945  (SDS,  University  of
California,  Berkeley),  43-44,  79,  275,
291-300,  542
(See  also  BTSS)
SEAC  (National  Bureau  of  Standards),  39,
43-45,  172,  192,  209-212,  440
SEL/Systems  Engineering  Laboratories,  44
SEL  810,  44

Sigma  2  and  3  (SDS  =>  XDS),  43-44,  78

Sigma  5  and  7  (SDS  =>  XDS),  43,  170,  396,  564

SILLIAC,  89

SIMSCRIPT  language,  45

SIMULA  language,  45

SNOBOL  language,  45

SOL  language,  45

SOLOMON,  315,  320

Soviet  Academy  of  Sciences,  213

SPC-8  and  12,  44

SPECTRA  70  Series  (RCA),  43

SS  80  I  and  II  (UNIVAC),  43

Strela/Arrow  (Russian),  44,  192,  213-215

ISP,  213-215
Stretch/IBM  7030,  43-45,  47,  91,  396-397,
421-439 

arithmetic,  428-431 

circuits,  433-438 

D,  427-431 

input-output,  421-422 

interrupt,  423 

introduction,  421 

ISP,  422-424 

K(P),  424-428 

look-ahead,  426-428 

packaging,  432,  438-439

performance,  421-423,  425-426,  431-433

PMS,  421-423,  425-426




RT,  426-431
Subscriber  Station  (see  ComLogNet)
SWAC,  39,  43

System/360  (see  IBM  System/360)


Texas,  University  of,  network,  506-507 
Toronto  University  Computer,  44 
TRAC  language,  45 
TRE,  39 

TRW/Thompson,  Ramo,  Wooldridge  (see  RW-40

and  400) 
Turing  machine,  23 

TX-2  and  TX-0  (Lincoln  Laboratory,  M.I.T.), 
39,  43-45,  274 


UNCOL  language,  8-9,  13 

U.S.  Army  Ordnance  Department,  92 

UNIVAC,  39,  43-45,  48,  91,  157-169

applications,  164-165 

design  constraints,  163 

input-output,  158,  161-162 

interpreter,  159-161

ISP,  157-160 

performance,  164-168 

PMS,  158 

reliability,  165-169 

RT,  157-160 

T(io),  161-163 

(See  also  SS  80  I  and  II)


UNIVAC  II  and  III,  39,  43-45
UNIVAC  418,  1218,  and  1818,  43-44
UNIVAC  490,  491,  492,  and  494,  43-44
UNIVAC  1004  I,  II,  III,  1005  I,  II,  and  III,
43,  44
UNIVAC  1050,  43,  44
UNIVAC  1101  and  1102,  39,  43
UNIVAC  1103A,  39,  43,  44,  48,  62,  192, 

205-208 
ISP,  205-208 
UNIVAC  1105,  39,  43 

UNIVAC  1108,  1107,  and  1106,  10,  43-45,  62, 

170,  192,  564 
UNIVAC  1206,  43 
UNIVAC  1212  (Military),  43 
UNIVAC  9200  and  9300,  43 


Varian  Associates  (see  under  DMI) 

von  Neumann/IAS/Institute  for  Advanced 

Studies,  39,  42,  44,  58,  89,  92-119,  152,

398 

applications,  92-93 
checking,  118 
D,  96-111 

design  constraints,  92-93 
input-output,  92,  117,  119 
interpreter,  111-119 
ISP,  111-119 
M,  94-96 


WEIZAC,  43,  89 

Whirlwind  I  (M.I.T.),  10,  39,  43-45,  55,
58,  90,  137-145,  303,  470 
applications,  138 
D,  142 

interpreter,  140-141 
introduction,  137-139 
ISP,  145 
K,  139-143 
M,  141 

packaging,  141-143 
PMS,  90,  138-139
Wilkes'  microprogrammed  computer  example,
44,  335-340
design,  335-337
introduction,  335 
ISP,  337-340 
microprogram,  339-340 
RT,  336 


XDS/Xerox  Data  Systems  (see  SDS) 


ZEBRA  (Standard  Telephones  and  Cables, 
Ltd.),  44,  191-192,  200-204,  216 

introduction,  200 

ISP,  200-204 

PMS,  201 
ZUSE  Company,  39,  42 


Subject  Index 


Page  references  in  boldface  refer  to  the  Appendix,  ISP  descriptions,  and  PMS  diagrams.


abbreviation  /,  19,  607,  609
acceptance  test,  UNIVAC,  165-166
access-i-unit-operation,  633
access-time,  620-622
accessing  algorithm,  41
accumulator,  ZEBRA,  202
accumulator  register,  59-60,  98
accuracy,  HP  9100A,  246,  256
acoustic  delay  line,  96
[See  also  under  M(delay  line)]
action  →,  23-24,  631-632
action-sequence,  23,  631
actual  address,  76-81
(See  also  physical  address)
adaptability:
D825,  447-448
RW-400,  477-479
adder,  Pegasus,  174
addition,  von  Neumann,  98-99
address-expression,  631-632
address-range  [  ],  24,  631-633
address-size,  P,  626-627
addresses-per-instruction,  P,  57-63,  627
(See  also  instruction  format)
addressing  (see  memory  addressing;  memory
mapping;  multiprogramming)
addressing  system,  memory,  16
aerospace  computer,  146-156
algorithm-encoding-efficiency,  P,  627
alias  /,  19,  607,  609
alphabet,  609,  613
alternation  |,  indefinite  expression,  17,  610
and  ∧,  25
(See  also  n-ary-boolean-operation)
antecedent,  619
applications:
Lehman  computer,  464-469
NOVA,  316-317
PDP-8,  120
PILOT,  440
UNIVAC  I,  164-165
von  Neumann,  92-93
Whirlwind  I,  138
approximation-,  607-608,  610
architecture,  562
(See  also  ISP;  under  PMS)
archival  memory  [see  M(archival)]
area,  617,  619
arithmetic:
multiple-precision,  AGC,  151-152
parallel,  429-430
serial,  428-429
Stretch,  428-431
arithmetic  element,  Whirlwind,  142
arithmetic  expression,  614
arithmetic-function-operation,  614
arithmetic  organ,  von  Neumann,  98
(See  also  D/data-operation)
arithmetic  unit,  KDF  9,  263-266
array  instructions,  NOVA,  316-319
array  processor  [see  P(array)]
ASCII/American  Standards  Code  for
Information  Interchange,  593
assemble  instruction,  457-458
assignment  :=,  23,  607,  609
associative  memory  [see  look-aside  memory;
M(associative)]
attribute,  19,  607,  612-613
attribute-list,  612-613
attribute:value  pair  (see  attribute:value)
auto  index  register,  120-122,  134
availability,  447
Lehman  computer,  456-457
available  space  list,  IPL  VI,  352-353


b  (see  bit) 
B  line:

Atlas,  277-278

Manchester  University,  340

(See  also  index  register)
barrel,  CDC  6600,  474,  489-491
base,  24,  55-56,  614,  616,  631
base-data-type,  630-631
base  register,  MIDAC,  210
bench-mark,  52
bilinear  switch,  623-624

binary-arithmetic-operation  +  -,  614,  633-635
binary-boolean-operation,  615,  633-635
binary-decimal  conversion,  211

(See  also  ISP,  IBM  System/360)
binary  machine,  87-88
binary-operation,  28,  633
binary-value,  611
bit/binary-digit,  611,  616-617
bit  string,  317-318

(See  also  data-type,  Stretch)
block,  617

block  diagram  (see  PMS  diagram;  PMS  level;
RT)

block  transfer,  ZEBRA,  204
BNF/Backus-Normal  Form  (Backus-Naur
Form),  9
boolean,  608,  615
boolean-expression,  615


boolean-operations  =  ⊕  ⊃  ∨  ∧  ¬,  608-609,
633-635

branch  instruction,  595
breakouts,  IPL  VI,  350-351
buffer  module,  RW-400,  482-484
bulk  core  memory  (see  M/memory)
bus,  10  (See  also  S/switch)
business  computer  (see  function)
buzzer,  ACE,  198
by/byte,  616
IBM  Stretch,  423
IBM  System/360,  591

C/computer  (see  computer) 

C(1  Pc),  40-41,  63-70,  395

C(1  Pc-nPio),  40-41,  63-70,  396-398

capital  letters,  609

card,  IBM,  617

carrier,  618

data-type,  629-631
carry,  98-99

casting  out  three,  Stretch,  431
central  processor  [see  P(c)]
channels  [see  P(io)]
character-base,  631-632
character/char,  616
character  generation  instruction,  308
character  string,  184-185

(See  also  variable-length  character  string)
checking: 

Stretch,  431 

UNIVAC,  160-161,  168-169 

Whirlwind,  143-144 
circuit  level,  4 
circuits: 

CDC  6600,  494-495 

component  count,  470-471 

PDP-8,  132-133

Pegasus,  171-174,  176

Stretch,  433-438

component  count,  431-432
class,  609-610
cocomponent,  617

co-incident  current  memory  [see  M(core)]
colon  :,  19,  612-613,  631

(See  also  attribute:value  pair)
combinatorial  circuits,  5
comma,  611
commands,  608-610

(See  also  abbreviation;  assignment,  form;
variable)
COMMENT,  608




comments,  608 

communication  computer  (see  function) 

communication  multiplexing,  505

compiler,  EULER,  391-392 

complex,  data-type,  631 

complex  number  arithmetic,  246,  255-256 

component: 

data-type,  629-631 

PMS,  616-619 
component-function,  617
component-name,  617
compound  computer,  628
compound-link,  619-620
compound  name  (see  name)
computer,  628 

control,  146-156 

duplex,  66 
computer  levels,  3-11 

PDP-8,  126-127 
computer  model,  63-66 
computer-space  dimensions,  40 
concatenation  □,  24,  631-633
concurrency,  617-618 
concurrency-type,  617 
condition,  2.3,  631 
condition  codes,  IBM  1800,  407 
conditional  micro-order,  336-337 
configurator,  IBM  1800,  400-403 
construction  (see  packaging) 
content  addressable  memory  [see  M(content 

addressable)] 
contextual  addressing,  267-268 
continuous-modulation,  618 
control,  624-625 

ILLIAC  IV,  322-323 

Stretch,  424-428 

Whirlwind,  139-142

(See  also  interpreter;  K/control;  RT)
control  computer  (see  function)
control-operation,  633
control-organ,  von  Neumann,  111-119
controlled-operation,  624-625
conversion,  615-616

conversion-arithmetic-operation,  633-634
Cooley-Tukey  algorithm,  73 
cooling,  470 

Pegasus,  181 

UNIVAC,  163 
core  memory  [see  M(core)] 
cost,  616-617,  619 
count-expression,  614 
country,  619 
cross-point  switch,  267 

crossbar  switch  [see  S(crossbar);  under  S(cross-point)]

CRT/Cathode  Ray  Tube  display  [see  under  T(CRT)]
current,  616
cycle-time  (see  memory)
cyclic  memory  [see  M(cyclic)]
cyclic  switch,  623-624 


D/data-operation,  17,  23-36,  626 
d/decimal  digit,  616 
D(Stretch),  427-431 
data  break,  PDP-8,  124-126 
data  channel  [see  P(io)] 

IBM  7094,  523-525 

SDS  900  series,  543,  546-548,  552-555
data-expression,  631
data  field  register,  120,  523
data  flow,  Stretch,  425-428
data-operation,  17,  23-36,  626 
data-operation  definition,  ISP,  636-637 
data-operations  table,  633-635 
data  programs,  IPL  VI,  360 
data  structure,  IPL  VI,  351,  354 
data-type,  23-36,  57

ISP,  629-631

P,  626-628

Stretch,  423-424
data-type  format,  ISP,  636-637
data-type-name,  629-631
decimal,  614
decimal  digit,  616

decimal  machine,  57,  87-88 
decimal-name,  614 
DECtape,  124-126 
definite  expression,  607-608,  611-612 
definition  :=  (see  assignment)
delay,  L,  620

delay  line  [see  under  M(delay  line)] 
dequeue  switch,  623-624 
descendants,  619 
descriptor,  79-81 

B  5000,  271-272 
design  philosophy: 

D825,  447-450

Lehman  computer,  456-457

SD-2,  341-343
desk  calculators,  235-256
destination  address,  ACE,  194-199
digital  computer  (see  C/computer)
digital  differential  analyzer,  304
digits,  609 
dimension,  608,  615 
dimension-expression,  615 
direct  access  communications  channel,  SDS 

900  series  (see  data  channel) 
direct  memory  access,  PDP-8,  124-126 
direction,  618 

directive  instructions,  Lehman  computer,  459 

discrete-modulation,  618 

disk,  74,  577,  579 

display  processor  [see  P(display)] 

distribution,  switch,  623-624 

divergence,  T,  625-626 

divergence-rate,  T,  625-626

divide  step,  SDS  900  series  (see  ISP)

division:

nonrestoring,  107-111

restoring,  107-111

UNIVAC,  159


drum  [see  M(drum)] 

dynamic  data  types,  383 

dynamic  storage  allocation,  383-384


EBCDIC/Extended  Binary  Coded  Decimal 

Interchange  Code,  592 
ECL/Emitter  Coupled  Logic,  320 
edit  instruction,  228 

effective  address  calculation  process,  ISP,  28,

59-60,  636-637
efficiency,  processor,  626-628
electrostatic  memory,  75
element-range  <  >,  24,  631-633
ellipses  ...,  608,  610
emulation,  562-563 
encode,  16 
encoding,  618 
entity,  608,  611-612 
error-rate,  617,  619 
evoke  operation  →,  23,  631,  633
EXAMPLE,  608
excess  three  code,  UNIVAC,  163
expansibility  criteria,  D825,  448 
expression,  608 

definite,  607-608,  611-612 

indefinite,  607-608,  610 

optional,  613 

(See  also  boolean-expression;

count-expression;  dimension-expression; 
relational-expression) 
expression-variables,  608 
extended  core  store/ECS,  CDC  6600,  473 
external  execute  instruction,  458 
extra  codes,  597

AGC,  154-155

Atlas,  274-278 

(See  also  syspop) 


fabrication  (see  packaging) 
family  tree  of  computer  design,  39 
fast  Fourier  transform,  73 
features,  225-226 

fetch-execute  cycle  (see  interpreter) 
field,  data-type,  631 
file,  617 

BTSS,  297-300
file  control  (see  function)
fixed  point  (see  data-type) 

number-data-type,  630-631 
fixed  structure  network,  504 
flag  bit,  IBM  1401,  226 
floating  point,  97 

Atlas,  277-278,  283-285 

B  5000,  268-270

HP  9100A,  243-256

IBM  7094,  527

KDF  9,  263-266

number-data-type,  630-631

SDS  900  series,  544-545,  549-551 


floating  point,  Stretch,  429-431,  433

UNIVAC  1103A,  208

Wilkes  example,  335
fork  instruction,  325,  457
form,  607,  610
format,  data-type,  629-631
full-duplex,  617-618
function,  37,  40,  46-49

business,  47-48 

C,  618 

communication,  48 
component,  617 
control,  48 
file  control,  48 
operation,  28 
P,  626-627 
scientific,  47 
T,  625-626 
terminal,  48-49 
time-sharing,  49 
functional  units,  CDC  6600,  473,  494


gate  tubes,  112-119 

general  conventions,  PMS  and  ISP,  607-615
general  registers:

8-bit  character  computer,  184-187

Pegasus,  176-179 
generations  (first,  second,  third,  and  fourth), 

39-40,  43-46 
Gibson  mix,  49-50
graph-plot  instructions,  308

half-duplex,  617-618
hexa-decimal-digit/hex,  616
hierarchy  (see  structure)

switch,  623-624
high-level  language,  B  5000,  267
high-speed  core  memory  (see  M/memory)
history,  38-46,  617,  619
hyphen  -,  25,  607
hyphen-name,  613-614

i-rate,  617-618

i-unit/information  unit,  16,  616-618

base-unit,  616

data-type,  629-631

length,  616
i-unit-name,  616
i-unit-prefix
IBM-card,  617
iconoscope  tube,  94
illegal  instruction,  BTSS,  293
indefinite  expression,  607-608,  610
index  #,  20,  613
index  register,  59-60
information,  616

information  base,  24,  55-56,  614,  616,  631
information-content,  data-type,  629-631


information  length,  16 
information-rate,  617-618 
information  units,  616-618 
inhibit  drivers  [see  M(core)] 
input-output: 

ACE,  197-199 

Atlas,  274-283,  285-289 

BTSS,  297-300

D825,  454-455

IBM  1800,  405,  409-411

IBM  7094,  524-525 

ILLIAC  IV,  322,  327-328

PDP-8,  123

PILOT,  444-445

SDS  900  series,  543-545,  552-555

Stretch,  exchange,  421-422

UNIVAC  I,  158,  161-162
input  and  output  organ,  von  Neumann,  91, 

117,  119 
instruction: 

control,  DEC  338,  308-309

data,  DEC  338,  307-308

ISP,  631-632

special,  Lehman  computer,  457-461
instruction  backup  register,  IBM  7094,

520-522
instruction  buffers,  84

ILLIAC  IV,  323-324

(See  also  look-ahead;  look-aside)
instruction  decoding  diagram,  122-123,  184
instruction-efficiency,  P,  626-627
instruction  examples,  ISP,  632,  635-637
instruction  execution,  ISP,  25-36,  637
instruction  execution  process,  ISP,  637
instruction-expression,  23,  631-632
instruction  format: 

0  address/stack,  62-64,  257-261
stack:  B  5000,  268-273

KDF  9,  262-266

1  address,  58-60,  64,  87-91
AGC,  149-150

1  +  general  register  (see  general  registers)

1  +  1  address,  IBM  650,  220-223
1  +  index  address,  58-60,  87-91

2  address,  60-61

RW-400,  470,  480-482,  486-488
UNIVAC  1103A,  205-208

3  address,  60-61
MIDAC,  209-212
Strela,  213-215

general  registers,  61,  64

(See  also  general  registers)
IBM  1800,  407-408,  410-411
ISP,  25,  636-637
n  +  1  address,  61,  191
SDS  900  series,  544-545,  548-552
variable  number  of  addresses  per
instruction,  63
instruction  highway,  ACE,  197
instruction  interpretation  process,  ISP, 
636-637 


instruction  interpreter  (see  interpreter)
instruction  look-ahead  (see  look-ahead)
instruction-memory,  P,  627-628
instruction  modification,  209-210
instruction-set,  25

ISP,  636-637

K,  624-625

P,  626

(See  also  ISP)
instruction-size,  P,  626-627
instruction-source,  K,  624-625
instruction  unit,  Stretch,  426-427
integer-data-type,  630-631
+  integer-data-type,  630-631
integer-name,  614
+  integer-name,  614
-  integer-name,  614

integrated  circuit  memory  (see  M/memory)
interaction  controller,  Lehman  computer,  460 
interaction  function,  Lehman  computer, 
458-461 

interference,  processor-memory,  463-469
interflow,  151

interlace  (see  data  channel,  SDS  900  series)
interleaving  (see  memory  interleaving)
interpretation-cycle,  22-36

(See  also  interpreter)
interpreter,  22-36 

AGC,  147-148 

DEC  338,  305

EULER  microprogrammed,  385-392

FORTRAN  Machine,  366-379

IBM  1401,  229

IBM  1800,  408-409

IBM  7094,  522-523

ILLIAC  IV,  322-325

IPL  VI,  351,  354-355,  359-362

ISP,  636-637

PDP-8,  131

SDS  900  series,  550-552

Stretch  (see  instruction  unit)

UNIVAC,  159-161

von  Neumann,  111-119 

Whirlwind  I,  140-141 
interprocess  communication,  41
interprogram  communication,  81-83
interrupt/interprocess  interrupts,  82-83,  411

Atlas,  274-283

B  5000,  267-272

D825,  452-453

Lehman  computer,  458-461

PDP-8,  123

RW-400,  481-482

SDS  900  series,  553-555

Stretch,  423
interrupt-response-time,  P,  626-627
intraprocess  interrupt,  trap,  82-83

(See  also  extra  codes;  trap)
I/O  Bus:

PDP-8,  124-126

SDS  900  series  (see  input-output)


ISP/Instruction-set  Processor,  12,  22-33 
ACE,  193-199 
AGC,  152-155
Atlas,  276-279,  283-285
B  5000,  268-273
BTSS,  292-297

CDC  6600,  472,  491-493,  497-503 
DEC  338,  305-309,  310-314 
D825,  453 

8-bit  character  computer,  184,  186-187 

EULER,  383-385,  388-391

FORTRAN,  363-365

HP  9100A,  243-249

IBM  650,  220-223

IBM  1401,  226-229,  231-234 

IBM  1800,  407-416,  417-420 

IBM  7094,  523,  526-541

IBM  System/360,  Model  30,  385-388

ILLIAC  IV,  322-325,  330-333

IPL  VI,  354-358,  361-362

KDF  9,  262-263

LGP-30,  LGP-21,  217,  218-219

MIDAC,  209-212

NOVA,  317-318

PDP-8,  22-25,  26-27,  28-33,  120-123,  127,
134-136

Pegasus,  176-179,  182-183 

PILOT,  442-444 

Programma,  237-242 

RW-40,  RW-400,  470,  480-482,  486-488 

SD-2,  343-347

SDS  900  series,  544-545,  548-550,  556-560 

Strela,  213-215 

Stretch,  422-424 

UNIVAC,  157-160

UNIVAC  1103A,  205-208

von  Neumann,  111-119 

Whirlwind,  140-141,  145 

Wilkes  example,  337-339 

ZEBRA,  200-204 
ISP  conventions,  628-637 
italics,  24,  608


join  instruction,  457 

K/control,  16-22 

(See  also  control) 
k/kilo,  616 
kernels,  464 
keyboard: 

HP  9100A,  235,  244-249,  251-253 

Programma  101,  237-242

[See  also  T(keyboard)]


L/link,  16-22,  619-620 
label,  612 
labeled-entity,  612 
language,  9 


large  capacity  store/LCS,  571-572,  582-583 
lattice  (see  structure) 
length,  616 

length-type,  data-type,  629-631 
level,  system,  3-4 
LINCtape,  124-126
lineage,  617,  619 
linear  switch,  623-624 
link,  619-620 
delay,  620 

port-to-port  delay,  620 
list,  607,  611 

list  processing,  EULER,  384 
list  structure,  IPL  VI,  350
literal  syllable,  B  5000,  272
location,  S,  623-624
logic  diagrams,  PDP-8,  127-133 
logic  equations,  PDP-8,  127-133 
logic  technology,  40,  617-618 
logical  address,  76-81 
BTSS,  291 

(See  also  memory  mapping; 

multiprogramming) 
logical  design  level,  5

FORTRAN  Machine,  365-381 

PDP-8,  127-133 

Pegasus,  172-175,  179-181 
logical  structure  (see  ISP,  IBM  System/360) 
look-ahead: 

Atlas,  281-285,  287-289

CDC  6600,  492-494 

IBM  7094,  550-552 

ILLIAC  IV,  323-324 

Stretch,  397,  422,  424-428
look-aside  memory,  84,  574 

[See  also  M(content  addressable)] 


M/memory,  16-22 

(See  also  memory) 
M(associative),  76 

[See  also  M(content  addressable)]
M(bulk  core),  74
M(content  addressable),  74

(See  also  look-aside)
M(core),  PDP-8,  128-130
M(cyclic),  73-74

M(delay  line;  ACE,  Deuce),  191,  193-199
M(delay  line;  Pegasus),  173-174,  177
M(delay  line;  UNIVAC),  163 
M(drum),  74 

M(electrostatic;  Whirlwind  I),  141 
M(fixed-head  disk),  74 

M(fixed-head  disk;  ILLIAC  IV),  322,  327-328
M(large  storage;  Whirlwind),  137-138,  141
M(magnetic  card),  74 

M(magnetic  card;  HP  9100A),  248-249,  253 
M(magnetic  card;  Programma  101),  237-242 
M(magnetic  tape),  74 
M(magnetic  tape;  IBM  format),  126 
M(magnetic  tape;  RW-400),  483


M(magnetic  tape;  Uniservo),  157
M(moving  head  diskpak),  74 
M(p/primary  memory),  17,  24,  74 
M(p;  concurrency),  41,  76-81 
M(p;  size),  41 
M(photostore;  IBM),  507 
M(punched  card),  74 
[See  also  T(punch)] 
M(queue),  73 
M(random),  75 
M(read  only),  604-605 

M(read  only;  capacitor;  System/360,  Model

30),  385-387
M(read  only;  HP  9100A),  235,  250-253
M(read  only;  rope;  AGC),  146-147
M(s/secondary),  74
M(stack),  73

M(stack;  B  5000),  269-271
M(thin  film;  D825),  453-454
M(toggle  switch;  Whirlwind  I),  142-143
M(UNIVAC),  158,  164

machine-independent  language,  B  5000,  267
macro-parallelism,  456,  463 
magnetic  card  [see  M(magnetic  card)] 
magnetic  tape  [see  M(magnetic  tape)] 
magnetic  wire  memory,  96 
main  line  of  computers,  87-91 
maintenance: 

ILLIAC  IV,  328-329 

Pegasus,  181-182 

UNIVAC,  165-169 

Whirlwind  I,  138-139,  142-143
manufacturer  catalog  number,  617 
manufacturer  name  (see  proper-name) 
manufacturer-type,  619 
map  (see  memory  map;  multiprogramming)
marks,  609 

master  control  program,  B  5000,  267-268

master  slave  schemes,  D825,  449 

matrix  multiply  problem,  Lehman  computer, 

464-466 
medium,  618 
memory,  620-622 

access-time,  620-622 

cycle-time,  620-622 

function,  620-622 

information-rate,  620-622 

operations,  620-622 

permanency,  620-621 

portability,  620-621

primary,  621 
[See  also  M(p)] 

processor  state,  621 

secondary,  621 
[See  also  M(s)] 

size,  620-622 

technology,  620-622 

(See  also  M/memory;  memory  organ) 
memory  access  algorithm,  73 
memory  addressing: 

AGC,  155-156 


SDS  900  series,  542,  549-550 
memory  bus,  Stretch,  422,  426

(See  also  S/switch)
memory  declaration,  36
memory-expression,  631-632
memory  interface  connection,  SDS  900  series,

543,  546-548,  555
memory  interleaving: 

Atlas,  289-290 

CDC  6600,  473,  493 

IBM  7094,  517-522 

ILLIAC  IV,  322-324,  327-328 

Stretch,  397,  421-422 
memory  map: 

BTSS,  291-295 

IBM  7094,  523 
memory  mapping,  77-80 

(See  also  multiprogramming) 
memory  organ,  von  Neumann,  92-96
memory  protection,  IBM  1800,  408
memory  violation,  BTSS,  294-295
message  concentrator,  120
message  switching,  505
metanotation,  607-609 
micro-operation,  Wilkes,  335-337,  339 
micro-order: 

System/360,  Model  30,  385-388 

Wilkes,  335-337 
micro-parallelism,  Lehman,  456 
micro-programme,  Wilkes,  335

[See  also  P(microprogram)]
microprogram: 

control  fields,  387 

HP  9100A,  254-256 

sequencing,  388 

status  bits,  388 

symbolic  representation,  388-389

[See  also  P(microprogram)]
microprogram  processor  [see  P(microprogram)]
micro-subroutines,  Wilkes,  339-340
mixed  number,  data-type,  630-631
MOBF/mean-operations-between-failure,
617-618

modular  scheme,  D825,  449-450
modulation,  618
monitor  map,  BTSS,  291-295
monitor  mode,  BTSS,  291-297
moving  head  disk,  74,  577,  579
Mp-concurrency,  processor,  627-628
MTBF/mean-time-between-failure,  617-618
multiple  addresses  per  instruction,  191

(See  also  instruction  format)
multiple  data  stream,  83-84
multiple  instruction  stream,  83-84
multiplex,  617-618

multiplexer,  memory,  IBM  7094  II,  518-519 

(See  also  S/switch) 
multiplication,  100-111

AGC,  152

UNIVAC,  157


multiplication,  Whirlwind,  142
multiplier,  615

multiplier-quotient  register,  59
multiply  step,  SDS  900  series  (see  ISP)
multiprocessing,  446-469
multiprogramming,  76,  274-275,  456-469

Atlas,  274-283

B  5000,  267-268

BTSS,  291-295

(See  also  memory  map;  multiprocessing;
parallel  processing)


n-ary-arithmetic-operation,  614,  633-635 
n-ary-boolean-operation,  615,  633-635
n-ary  operation,  633
name,  607,  609,  613-614

component,  617

compound,  25,  614

hyphen,  613-614

phrase,  613-614

primitive,  613-614

proper,  607,  617

simple,  607,  613-614
name-expression,  613
nesting  store,  263-266

[See  also  M(stack)]
network,  628

network  analysis  problem,  Lehman  computer,

466-469
network  computers,  447,  470-503
next,  24,  631

noisy  mode  floating-point,  422-423
nonary  operation,  633
null,  607,  613
number,  608,  614
number-data-type,  630-631
number-name,  614

number  representation,  AGC,  150-152
number-set-name,  615 


octal-digit,  616 

one-level  store,  Atlas,  179-283
one's  complement,  AGC,  150-152
onion  peeling,  Lehman  computer,  462-463
operand  call  syllable,  B  5000,  272
operating  system:

Atlas,  279,  285-287

B  5000,  267-268

BTSS,  292-300

CDC  6600,  472,  475

D825,  450-455

Lehman  computer,  461-463
operation,  616,  632-635

D,  626

K,  624-625 

M,  620-622 

P,  626-627 

port,  627-628 


operation,  S,  623 

T,  625-626 
operation-code-size,  processor,  626-627 
operation-expression,  631-635 
operation-modifier  {  },  30-32,  631-632
operation-rate,  port,  617-618
operation-rate-set,  617
operation-set,  617
operation-time,  19
operator  syllable,  B  5000,  272
optimum  coding,  193,  199
optional  expression,  607,  613


P/processor,  17 

(See  also  processor) 
P(1  address)  (see  instruction  format)
P(2  address)  (see  instruction  format)
P(3  address)  (see  instruction  format)
P(n  +  1  address)  (see  instruction  format)
P(array),  66

P(array;  ILLIAC  IV),  320-333
P(array;  NOVA),  318-319
P(c,  central  processor),  17-22,  71
P(display),  72
P(io),  72,  303-304

P(io;  analog/digital;  IBM  1800),  405,  409-416
P(language),  63,  73,  257

P(microprogram)/microprogram  processor,  61,
71,  334

P(microprogram;  SD-2),  341-347
P(microprogram;  System/360,  Model  30),
385,  388

P(microprogram;  Wilkes  example),  335-340
P(special  algorithm),  66,  72-73,  301
P(stack)  (see  instruction  format)
P(vector  move),  72
P-concurrency,  627-628 
packaging: 

CDC  6600,  494-496 

HP  9100A,  250,  252-253 

Pegasus,  174-176,  179-182 

SD-2,  341-343

Stretch,  432,  438-439

Whirlwind  I,  141-143
page:

address,  120-134

(See  also  memory  mapping;  multiprogramming)

Atlas,  274,  276,  279-283

BTSS,  291

mapping,  79-80
page  address  register,  Atlas,  279-283
parallel  arithmetic  (see  arithmetic)
parallel-by-function,  CDC  6600,  491-494
parallel  processing,  446,  456-469
parallel  programs,  IPL  VI,  359-360
parallelism,  456

parameter,  19,  611  (see  attribute)
parameter-set,  611


parentheses  (    ),  609 
performance,  37,  49-52 

CDC  6600,  470-471 

Lehman  example,  456-457,  463-469 

PILOT,  440-442

Stretch,  421-423,  425-426,  431-433 
UNIVAC,  164-168 
period  .,  25,  609,  614 

peripheral  and  control  processors,  CDC  6600, 

471-475,  489-491 
permanency: 

M,  620-622 

S,  623 
phrase-name,  613-614 
physical  address,  76-81 

BTSS,  291 

(See  also  memory  mapping;  multiprogramming)

pipeline  processor,  84-85
PMS  conventions,  615-628
PMS  diagram,  16-22
PMS  level,  9-10,  15-22

ACE,  191,  193,  198 

AGC,  146-148 

ARPA  network,  511 

Atlas,  277,  279-283,  289-290 

B  5000,  258-260,  268 

BTSS,  275,  292 

CDC  6600,  470,  471-475,  476,  489-494

ComLogNet,  509-510

computer  models,  63-66

D825  and  D830,  260,  450-451,  453-455

Deuce,  191

EULER,  382-392 

FORTRAN  machine,  365-366 

HP  9100A,  235,  249-254 

IBM  701,  515

IBM  1401,  226

IBM  1800,  400-405,  404

IBM  7094,  517,  518,  519 

IBM  ASP,  506 

IBM  System/360,  563,  579-587,  602-606
ILLIAC  IV,  321-322,  327-329
KDF  9,  260

Lehman  Computer,  459-461

LGP-30,  LGP-21,  217

M.I.T.  network,  507

networks,  504,  505-512

PDP-8,  20-21,  121,  123-131,  124,  126-128

PILOT,  398,  440-442

pipeline  processor,  84

Programma  101,  237,  237-238

RW-40,  RW-400,  471,  477-480,  482-485

SD-2,  343

SDS  900  series,  275,  543,  546,  546-548

Stretch,  421-423,  425-426 

S/switches,  67-69 

Texas,  University,  506-507 

UNIVAC,  158 

UNIVAC  1108,  11

Whirlwind  I,  90,  138-139


PMS  notation,  19-22
PMS  primitives,  16-22
PMS  structure,  41
PMS  structure  dimensions,  63-85
polar  coordinate  arithmetic,  246,  255-256
Polish  notation,  270-271,  391
port,  16-18,  617-618
port-to-port  delay,  L,  620
portability:  M,  620-622
T,  625

postulation,  indefinite-expression,  610 
power,  616-617,  619 
power  supply:  Pegasus,  181 

UNIVAC,  163 
precision,  data-type,  629-631 
primary  computer,  PILOT,  441-443
primary  memory  [see  M(core),  Mp-concurrency]
primitive-name,  613-614
print  column,  617
process,  BTSS,  293-297

process  control  computer,  IBM  1800,  399-420
process  map,  BTSS,  293
processing  elements,  ILLIAC  IV,  321-322
processor,  626-628 

address-per-instruction,  627 

address-size,  626-627 

algorithm-encoding-efficiency,  626-627 

concurrency,  41,  83-85,  626-627 

data-types,  626-628

encoding-efficiency,  626-627

function,  626

instruction-memory,  627

instruction-size,  626-627

interrupt-response-time,  626-627

ISP,  635-637

Mp-concurrency,  627-628
operation-code  size,  626-627 
P-concurrency,  627 
parallel/parallel-by-word,  83-84 
program-switching-time,  626-627 
serial,  83 

(See  also  P/processor) 
processor  state,  24,  57-63 
program  checking,  Pegasus,  178 
program  counter,  Whirlwind,  140
program  entry  mode,  desk  calculator,  235 
program  field  register,  120 
program  level,  8-10 

program  reference  table,  B  5000,  271-272 
program-switching-time,  626-627 
programmed  operator,  SDS  900  series,  542,
544-545,  550

(See  also  extra  codes)
programming  criteria,  D825,  448
proper-name,  607,  617
protection  and  relocation  registers,  80
PSW/program  status  word  (see  processor  state)
punched  card  [see  M(punched  card)]
push-pop  instruction,  DEC  338,  308-309

(See  also  stack) 
pyramid,  CDC  6600,  474 


quantity,  608,  615 

queue  memory  [see  M(queue)]

quit  instruction,  457

random  access  memory  [see  M(random)]
range  ~,  indefinite  expression,  19,  610
readability,  618 
real  address,  76-81 

(See  also  physical  address) 
record,  617 

recursive  procedure,  EULER,  383-384 
referent,  data-type,  16,  629-630
referent-expression,  data-type,  629-630
register,  632

register  transfer  (see  RT)
relation,  608

relational-arithmetic-operations

=  ≠  <  >  ≤  ≥,  608-609,  634
relational-expression,  615
relational-i-unit-operations,  633-634
relational-operation,  615,  634
relations,  615
reliability,  617-618

HP  9100A,  253

ILLIAC  IV,  328-329 

Lehman  computer,  456-457 

network,  505 

Pegasus,  181-182 

UNIVAC,  166-168 

Whirlwind,  138-139 
relocation  registers,  80 

(See  also  memory  mapping;  multiprogramming)
renaming,  632 
repeat  instruction,  207 

NOVA,  316 
replicated  single-computer  systems,  448 
resource  allocation  diagram,  10 
resume  instruction,  458
reverse  polish,  262-263 
round-off,  104-107 
RT/register  transfer  level,  5-7 

Atlas,  287-289

CDC  6600,  491-494

FORTRAN  Machine,  364-368,  375-381 

HP  9100A,  250 

IBM  1401,  229-230

IBM  1800,  405-409,  411-413 

IBM  7094,  520-522 

ILLIAC  IV,  326 

IPL  VI,  352-354

KDF  9,  264

NOVA,  318

PDP-8,  125,  127-133

SD-2,  343-345

SDS  900  series,  550-552

Stretch,  426-431

UNIVAC,  157-160

Wilkes  example,  336

(See  also  logical  design  level;  microprogram) 


S(crossbar;  Mp-Pc;  Lehman  computer),  461
S(cross-point),  67-70
S(cross-point;  B  5000),  258,  267-268
S(cross-point;  D825),  450-454
S(cross-point;  non-hierarchy;  RW-400),  478-480

S(duplex),  66-69
S(hierarchy),  67-70

S(Inter-memory  transfer  trunk;  PILOT),  443

S(non-hierarchy),  68-69

S/sec/seconds,  616

S(simplex),  66-69

S/switch,  17-22,  41,  66-70

(See  also  switch)
S(Telephone  exchange),  506
S(trunk;  CDC  6600),  493
scientific  computer  (see  function)
scoreboard,  CDC  6600,  473,  492
scratch-pad  memory,  58
secondary  computer,  PILOT,  443-444
segmentation,  77-81

(See  also  memory  mapping;  multiprogramming)
Selectron  memory,  95 
semantics,  607-608 
semi-colon  ;,  611 
sense  amplifiers,  128-130
sequencing  (see  interpretation-cycle)
sequential  circuits,  5
serial  arithmetic,  428-429
serial  computer  microprogramming,  340
set,  607,  611

shared  memory  scheme,  D825,  448-449
sharing  networks,  504-505
sign,  614

simple-computer,  628
simple-link,  619-620
simple-memory,  620-621
simple-name,  607,  613-614
simplex,  617-618
simulation:

digital  computers  (See  also  emulation;  interpretation-cycle)

Lehman  computer,  463-469
single  data  stream,  83-84
single  instruction  stream,  83-84
slave  memory  (see  look-aside)
SLT/Solid  Logic  Technology,  5&4,  603-604
small  letters,  609 
source  address,  ACE,  194-199 
space,  SD-2,  341 
space  „,  25,  607 
spacing,  609 

specialization,  indefinite  expression,  610 
split  instruction,  457
square  root  instruction,  241 
stack: 

B  5000,  260-261,  269-271 
DEC  338,  308-309
EULER,  385
KDF  9,  260-261 


stack  memory,  73 

[See  also  instruction  format;  M(stack)]
stack  switch,  623-624
state,  616
state  diagram:

ISP,  29
(See  also  interpreter)

PDP-8,  131
state-system  level,  7,  15-16
staticizor,  Pegasus,  174 
step,  631-632 

storage  protection  (see  memory  mapping;  memory  protection;  multiprogramming)
store  and  forward  network,  504
stored  program  digital  computer  (see  computer)
string,  613

structure,  37-38,  52-85

computer,  628

hierarchy,  63-70

lattice,  65 

tree,  65 
subcomponents,  617 

subroutine  calling  instructions,  PDP-8,  123,  135

subroutine  file,  BTSS,  299-300

subscripts  (see  base  register;  index  register)

subscripts  ↓,  609

subtraction,  99-100

superscripts  ↑,  609

SVC/Supervisor  Call,  597

switch,  41,  66-70,  623-624

concurrency-type,  623-624

control-terminal,  70 

distribution,  623 

hang-up-delay,  623-624 

hierarchy,  623 

location,  623 

permanency,  623-624

processor-control,  69-70

processor-memory,  66-69

(See  also  S/switch)
switch-type,  623
syllable:

B  5000,  272

KDF  9,  263
synchronizer,  IBM,  518-519  (see  controls)
syntax,  607,  609

syspop/system  programmed  operator,  BTSS,
292

system  level,  3-4 

T(CRT;  DEC  338),  305
T(CRT;  HP  9100A),  243,  251
T(CRT;  RW-400),  484

T(keyboard;  HP  9100A),  235,  244-249,  252-253
T(keyboard  to  tape;  Unityper),  161-162
T(punch;  card/paper  tape),  580
T(tape  to  print;  Uniprinter),  161-162
t/time,  616

T/transducer  terminal,  17 
(See  also  transducer) 


table  look-up: 

IBM  650,  220,  222

ZEBRA,  204
task,  Lehman  example,  456-458,  461-463
technology,  53-55,  617-618

M  (see  memory,  technology) 

T,  625 
Teletype,  126 
temperature,  616-617,  619 
terminate  instruction,  457-458
test  and  set  instruction,  458
test  control,  Whirlwind,  142-143
tetrads,  112 

three  addresses  per  instruction,  193-194

(See  also  instruction  format)
time,  616
time  chart,  43-46

time  sharing  computer  (see  function;  multiprogramming)
timer,  IBM  1800,  411
transducer,  625-626

divergence,  625-626

technology,  625
transduction,  T,  625-626
transduction-technology,  625
transmission-operation,  633
transmit  ←,  23-24,  631-633
trap:

IBM  7094,  515,  522-524,  526,  532,  536,  541
ILLIAC  IV,  325

(See  also  intraprocess  interrupt,  trap)
tree  (see  structure,  tree)
trouble  shooting  (see  maintenance)
turn-around-time,  618

twin  mode  instructions,  SDS  900  series,  543-545


unary-arithmetic-operation,  614,  633-635 

unary-boolean-operation,  615,  633-635 

unary  operation,  28,  633

unary-vector-operation,  633-634

unit,  608,  615

units,  general,  616

user  mode,  BTSS,  291-297 


value,  19,  607,  611-613
variable,  607,  609-610

variable-character-variable  length  strings,  414
variable-length  character  string,  184-185,  224

B  5000,  268-269,  272-273

EULER,  383,  388-391
variable  structure  network,  504
vector  display  instructions,  308
virtual  memory/virtual  address,  77-80

(See  also  memory  mapping;  multiprogramming)

virtual  table  of  contents,  14
voltage,  616
volume,  617,  619 


w/word,  617
wait  instruction,  458
weight,  617,  619
Wideband  Communication  Center,  507
wiring  (see  packaging)
word/w,  617
word  length,  56-57
AGC,  146,  148-152
CDC  6600,  489,  492
PILOT,  442-444
Stretch,  414-421
(See  also  performance)
Whirlwind,  137
(See  also  data-type;  design  philosophy)
word  mark  character,  IBM  1401,  226
word  size,  40
writability,  618

x-list,  611
x-name,  614
x-set,  611


|        alternation,  17,  610

:=       assignment,  23,  607,  609

;        semicolon,  611 

:       colon,  19,  612-613,  631 

,       comma,  611

←       transmit,  23-24,  631-633

→       evoke  operation,  23,  631,  633

=  @  D  V  A  — I       boolean-operation,  608-609,  633-635
=  ^  634 

=  ≠  <  >  ≤  ≥       relational-arithmetic-operations,  608-609,  634
?  611 


+  -       unary-arithmetic-operation,  614-615,  633-635

×  /       binary-arithmetic-operations,  614-615,  633-635
~       range,  19,  610
↑       superscript,  609
↓       subscript,  609
□       concatenation,  24,  631-633

/       abbreviation,  19,  607,  609
space,  25,  607
period,  25,  609,  614
$       609,  616 


#  index,  20,  613 
607,  613 
name,  607,  617
609 

0        null,  607,  613

*  615 
IX  615 

(    )       parentheses,  609 
[    ]       address-range,  24,  631-633 
{    }       operation-modifier,  30-32,  631-632
<    >       element-range,  24,  631-633
...       ellipsis,  608,  610


About  the  Authors 

C.  GORDON  BELL  received  his  S.B.  (1956)  and
his  S.M.  (1957)  from  the  Massachusetts
Institute  of  Technology.  During  his  career,
Professor  Bell  has  been  active  both  in
industry  and  in  education.  While  working
for  the  Digital  Equipment  Corporation
(DEC)  as  manager  of  computer  design,  he
was  responsible  for  the  design  of  the
PDP-6,  the  first  commercially  available
time-sharing  computer,  the  DEC  PDP-4,
and  the  PDP-5.  He  has  been  a  consultant  to
DEC  since  1966,  participating  in  the  design
of  the  PDP-11.  Author  of  numerous  published
papers,  Professor  Bell  is  a  former
Fulbright  Scholar  and  a  member  of  the
National  Science  Foundation  COSINE  Task
Force  for  developing  an  undergraduate
Computer  Engineering  Option  for  Electrical
Engineering  (1969).  He  was  a  research
scientist  doing  speech  analysis  by  computer
at  M.I.T.  and  is  currently  Professor  of
Computer  Science  and  Electrical  Engineering
at  Carnegie-Mellon  University.

ALLEN  NEWELL  obtained  his  undergraduate
degree  at  Stanford  University  (1949)  and
his  graduate  degree  (Ph.D.,  1957)  at  Carnegie
Institute  of  Technology.  Employed  as
a  research  scientist  at  the  RAND  Corporation
from  1950  to  1961,  Dr.  Newell  was  involved
in  the  laboratory  study  of  formal
human  organizations,  using  simulated
environments,  a  project  which  led  to  the
concept  of  system  training  of  large  organizations.
Since  1955,  he  has  been  involved
in  research  on  artificial  intelligence,  programming
systems  and  the  psychology  of
human  thinking.  With  J.  C.  Shaw  and  H.  A.
Simon  he  developed  the  first  list  processing
systems.  He  is  the  co-author  of  two
books,  one  in  the  area  of  list  processing,
the  other  in  the  area  of  problem  solving
programs.  Dr.  Newell  is  presently  University
Professor  at  Carnegie-Mellon
University.


OTHER  McGRAW-HILL  BOOKS 


INTRODUCTION  TO  APPLIED 
COMBINATORIAL  MATHEMATICS 

C. L. LIU, Massachusetts Institute of Technology.
McGraw-Hill Computer Science Series.
393 pages

Four aspects of combinatorial mathematics are
covered in this text: enumerative analysis, theory of
graphs, optimization techniques, and design of
experiments. The discussion combines fundamental
theory and modern applications, with basic concepts
unencumbered by excessive mathematical
detail. No prior background in the subject or in
modern algebra is required, and numerous worked-out
examples, which are graded in increasing difficulty,
are offered to lead the student into various
topics.

PROGRAMMING  LANGUAGES,  INFORMATION 
STRUCTURES  AND  MACHINE  ORGANIZATION 

PETER  WEGNER,  Cornell  University. 
McGraw-Hill  Computer  Science  Series.  401  pages 
Working  within  a  unified  framework,  the  author 
begins  from  the  notion  of  an  information  structure 
as  well  as  from  the  idea  that  a  computation  consists 
of  a  sequence  of  information  structures  which  are 
generated  from  an  initial  representation  by  the 
execution of a sequence of instructions. The structures
which arise during execution of programs in a
number  of  existing  programming  languages  are 
analyzed  in  detail.  The  text  covers  assemblers, 
macros,  multi-programming  systems,  and  simulation 
languages. 


NUMERICAL  CALCULATIONS  AND 
ALGORITHMS 

ROYCE BECKETT and JAMES HURT, both of the University
of Iowa. McGraw-Hill Series in Information Processing
and Computers. 298 pages
Essentially, this text teaches the use of digital computers
in the solving of engineering problems. While
presenting a variety of numerical methods for
solving typical problems on the computer, this text
does not concentrate on a particular computer.
Rather, it stresses the general techniques of problem-solving
with solution outlines in the form of flow
charts. A computer program can then be written
from these charts for any of the problem-oriented
algebraic languages.

SWITCHING  AND  FINITE  AUTOMATA  THEORY 

ZVI KOHAVI, Massachusetts Institute of Technology.
McGraw-Hill Computer Science Series. 500 pages
Many topics concerned with the diagnosis, structure,
and capabilities of logical machines are included
for the first time in this general text. This
new material constitutes about half of the entire
book. The author's purpose is to develop topics
which are the product of recent research and are of
practical interest to the logical designer and the
computer scientist. He provides the student and the
engineer with specific techniques for designing
logical circuits, techniques for minimizing or decomposing
combinational as well as sequential circuits,
etc.


McGraw-Hill  Book  Company 

Serving  Man's  Need  for  Knowledge® 
330  West  42nd  Street,  New  York,  N.Y.  10036 


